Abstract

In the last decades the Moore-Penrose pseudoinverse has found a wide range of applications in many areas of science
and has become a useful tool for physicists dealing, for instance, with optimization problems, with data analysis, with the
solution of linear integral equations, etc. The existence of such applications alone should attract the interest of students
and researchers in the Moore-Penrose pseudoinverse and in related subjects, like the singular values decomposition
theorem for matrices. In this note we present a tutorial review of the theory of the Moore-Penrose pseudoinverse. We
present the first definitions and some motivations and, after obtaining some basic results, we center our discussion
on the Spectral Theorem and present an algorithmically simple expression for the computation of the Moore-Penrose
pseudoinverse of a given matrix. We do not claim originality of the results. We rather intend to present a complete
and self-contained tutorial review, useful for those more devoted to applications, for those more theoretically oriented
and for those who already have some working knowledge of the subject.






1 Introduction, Motivation and Notation 



In this paper we present a self-contained review of some of the basic results on the so-called Moore-Penrose
pseudoinverse of matrices, a concept that generalizes the usual notion of inverse of a square matrix, but that is also applicable
to singular square matrices or even to non-square matrices. This notion is particularly useful in dealing with certain
linear least squares problems, as we shall discuss in Section 6, i.e., problems where one searches for an optimal
approximation for solutions of linear equations like $Ax = y$, where $A$ is a given $m \times n$ matrix, $y$ is a given column vector with
$m$ components and the unknown $x$, a column vector with $n$ components, is the searched solution. In many situations,
a solution is non-existing or non-unique, but one asks for a vector $x$ such that the norm of the difference $Ax - y$ is the
smallest possible (in terms of least squares).

Let us be a little more specific. Let $A \in \mathrm{Mat}(\mathbb{C}, m, n)$ (the set of all complex $m \times n$ matrices) and $y \in \mathbb{C}^m$ be
given and consider the problem of finding $x \in \mathbb{C}^n$ satisfying the linear equation



Ax = y. (1) 



If $m = n$ and $A$ has an inverse, the (unique) solution is, evidently, $x = A^{-1}y$. In other cases the solution may not exist
or may not be unique. We can, however, consider the alternative problem of finding the set of all vectors $x' \in \mathbb{C}^n$ such
that the Euclidean norm $\|Ax' - y\|$ reaches its least possible value. This set is called the minimizing set of the linear
problem (1). Such vectors $x' \in \mathbb{C}^n$ would be the best approximants for the solution of (1) in terms of the Euclidean
norm, i.e., in terms of "least squares". As we will show in Theorem 6.1, the Moore-Penrose pseudoinverse provides
this set of vectors $x'$ that minimize $\|Ax' - y\|$: it is the set

$$ \left\{ A^+ y + (\mathbb{1}_n - A^+A)z, \; z \in \mathbb{C}^n \right\} , \tag{2} $$

where $A^+ \in \mathrm{Mat}(\mathbb{C}, n, m)$ denotes the Moore-Penrose pseudoinverse of $A$. An important question for applications is
to find a general and algorithmically simple way to compute $A^+$. The most common approach uses the singular values
decomposition and is described in Appendix B. Using the Spectral Theorem and Tikhonov's regularization method
we show that $A^+$ can be computed by the algorithmically simpler formula



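$$ A^+ = \sum_{\substack{b=1 \\ \beta_b \neq 0}}^{s} \frac{1}{\beta_b} \left( \prod_{\substack{l=1 \\ l \neq b}}^{s} (\beta_b - \beta_l)^{-1} \right) \left[ \prod_{\substack{l=1 \\ l \neq b}}^{s} \left( A^*A - \beta_l \mathbb{1}_n \right) \right] A^* , \tag{3} $$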



where $A^*$ denotes the adjoint matrix of $A$ and $\beta_k$, $k = 1, \ldots, s$, are the distinct eigenvalues of $A^*A$ (whose
non-negative square roots are the so-called singular values of $A$). See Theorem 5.1 for a more detailed statement. One of the
aims of this paper is to present a proof of (3) by combining the spectral theorem with a regularization procedure due to Tikhonov [4, 5].

Some applications of the Moore-Penrose pseudoinverse

Problems involving the determination of the minimizing set of (1) are always present when the number of unknowns
exceeds the number of values provided by measurements. Such situations occur in many areas of Applied Mathematics,
Physics and Engineering, ranging from imaging methods, like MRI (magnetic resonance imaging) [8, 9, 10], fMRI
(functional MRI) [11, 12], PET (positron emission tomography) [16, 17] and MSI (magnetic source imaging) [13, 14, 15],
to seismic inversion problems [18, 19].

The Moore-Penrose pseudoinverse and/or the singular values decomposition (SVD) of matrices (discussed in
Appendix B) are also employed in data analysis, as in the treatment of electroencephalographic source localization [24]
and in the so-called Principal Component Analysis (PCA). Applications of this last method to astronomical data
analysis can be found in [20, 21, 22, 23] and applications to gene expression analysis can be found in [25, 26]. Image
compression algorithms using SVD have been known at least since [27], and digital image restoration using the Moore-Penrose
pseudoinverse has been studied in [28, 29].

Problems involving the determination of the minimizing set of (1) also occur, for instance, in certain numerical
algorithms for finding solutions of linear Fredholm integral equations of the first kind:

$$ \int_a^b k(x, y)\, u(y)\, dy = f(x) , $$

where $-\infty < a < b < \infty$ and where $k$ and $f$ are given functions. See Section 4 for a further discussion of this issue.
For an introductory account on integral equations, rich in examples and historical remarks, see [30].

Even this short list of applications should convince a student of Physics or Applied Mathematics of the relevance of 
the Moore-Penrose pseudoinverse and related subjects and our main objective is to provide a self-contained introduction 
to the required theory. 



Organization 

In Section 2 we present the definition of the Moore-Penrose pseudoinverse and obtain its basic properties. In Section 3
we further develop the theory of the Moore-Penrose pseudoinverse. In Section 4 we describe Tikhonov's regularization
method for the computation of Moore-Penrose pseudoinverses and present a first proof of existence. Section 5 collects
the previous results and derives expression (3), based on the Spectral Theorem, for the computation of Moore-Penrose
pseudoinverses. This expression is algorithmically simpler than the usual method based on the singular values
decomposition (described in Appendix B). In Section 6 we show the relevance of the Moore-Penrose pseudoinverse for
the solution of linear least squares problems, its main motivation. In Appendix A we present a self-contained review
of the results on Linear Algebra and Hilbert space theory, not all of them elementary, that we need in the main part
of this paper. In Appendix B we approach the existence problem of the Moore-Penrose pseudoinverse by using the
usual singular values decomposition method.

Notation and preliminary definitions 

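In the following we fix the notation utilized throughout the paper. We denote by $\mathbb{C}^n$ the vector space of all $n$-tuples of
complex numbers: $\mathbb{C}^n := \left\{ \begin{pmatrix} z_1 \\ \vdots \\ z_n \end{pmatrix}, \text{ with } z_k \in \mathbb{C} \text{ for all } k = 1, \ldots, n \right\}$. We denote the usual scalar product in $\mathbb{C}^n$ by
$\langle \cdot, \cdot \rangle_{\mathbb{C}}$ or simply by $\langle \cdot, \cdot \rangle$, where for $z = \begin{pmatrix} z_1 \\ \vdots \\ z_n \end{pmatrix} \in \mathbb{C}^n$ and $w = \begin{pmatrix} w_1 \\ \vdots \\ w_n \end{pmatrix} \in \mathbb{C}^n$, we have

$$ \langle z, w \rangle_{\mathbb{C}} = \langle z, w \rangle := \sum_{k=1}^{n} \overline{z_k}\, w_k . $$

Note that this scalar product is linear in the second argument and anti-linear in the first, in accordance with the
convention adopted in Physics. Two vectors $u$ and $v \in \mathbb{C}^n$ are said to be orthogonal according to the scalar product
$\langle \cdot, \cdot \rangle$ if $\langle u, v \rangle = 0$. If $W \subset \mathbb{C}^n$ is a subspace of $\mathbb{C}^n$ we denote by $W^\perp$ the subspace of $\mathbb{C}^n$ composed of all vectors
orthogonal to all vectors of $W$. The usual norm of a vector $z \in \mathbb{C}^n$ will be denoted by $\|z\|_{\mathbb{C}}$ or simply by $\|z\|$ and is
defined by $\|z\|_{\mathbb{C}} = \|z\| = \sqrt{\langle z, z \rangle}$. It is well known that $\mathbb{C}^n$ is a Hilbert space with respect to the usual scalar product.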

The set of all complex m x n matrices (m rows and n columns) will be denoted by Mat (C, m, n). The set of all 
square n x n matrices with complex entries will be denoted by Mat (C, n). 

The identity matrix will be denoted by $\mathbb{1}$. Given $A \in \mathrm{Mat}(\mathbb{C}, m, n)$ we denote by $A^T$ the element of $\mathrm{Mat}(\mathbb{C}, n, m)$
whose matrix elements are $(A^T)_{ij} = A_{ji}$ for all $i \in \{1, \ldots, n\}$, $j \in \{1, \ldots, m\}$. The matrix $A^T$ is said to
be the transpose of $A$. It is evident that $(A^T)^T = A$ and that $(AB)^T = B^T A^T$ for all $A \in \mathrm{Mat}(\mathbb{C}, m, n)$ and
$B \in \mathrm{Mat}(\mathbb{C}, n, p)$.

If $A \in \mathrm{Mat}(\mathbb{C}, m, n)$, then its adjoint $A^* \in \mathrm{Mat}(\mathbb{C}, n, m)$ is defined as the matrix whose matrix elements $(A^*)_{ij}$
are given by $\overline{A_{ji}}$ for all $1 \leq i \leq n$ and $1 \leq j \leq m$.

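Given a set $\alpha_1, \ldots, \alpha_n$ of complex numbers we denote by $\mathrm{diag}(\alpha_1, \ldots, \alpha_n) \in \mathrm{Mat}(\mathbb{C}, n)$ the diagonal matrix
whose $k$-th diagonal entry is $\alpha_k$:

$$ \big( \mathrm{diag}(\alpha_1, \ldots, \alpha_n) \big)_{ij} = \begin{cases} \alpha_i , & \text{for } i = j , \\ 0 , & \text{for } i \neq j . \end{cases} $$

The spectrum of a square matrix $A \in \mathrm{Mat}(\mathbb{C}, n)$ coincides with the set of its eigenvalues (see the definitions in
Appendix A) and will be denoted by $\sigma(A)$.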

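We denote by $\mathbb{0}_{a,b} \in \mathrm{Mat}(\mathbb{C}, a, b)$ the $a \times b$ matrix whose matrix elements are all zero. We denote by $\mathbb{1}_l \in \mathrm{Mat}(\mathbb{C}, l)$ the
$l \times l$ identity matrix. If no danger of confusion is present, we will simplify the notation and write $\mathbb{0}$ and $\mathbb{1}$ instead of $\mathbb{0}_{a,b}$
and $\mathbb{1}_l$, respectively. We will also employ the following definitions: for $m, n \in \mathbb{N}$, let $I_{m,\,m+n} \in \mathrm{Mat}(\mathbb{C}, m, m+n)$
and $J_{m+n,\,n} \in \mathrm{Mat}(\mathbb{C}, m+n, n)$ be given by

$$ I_{m,\,m+n} := \begin{pmatrix} \mathbb{1}_m & \mathbb{0}_{m,n} \end{pmatrix} \quad \text{and} \quad J_{m+n,\,n} := \begin{pmatrix} \mathbb{1}_n \\ \mathbb{0}_{m,n} \end{pmatrix} . \tag{4} $$

The corresponding transpose matrices are

$$ (I_{m,\,m+n})^T := \begin{pmatrix} \mathbb{1}_m \\ \mathbb{0}_{n,m} \end{pmatrix} = J_{m+n,\,m} \quad \text{and} \quad (J_{m+n,\,n})^T := \begin{pmatrix} \mathbb{1}_n & \mathbb{0}_{n,m} \end{pmatrix} = I_{n,\,m+n} . \tag{5} $$

The following useful identities will be used below:

$$ I_{m,\,m+n} (I_{m,\,m+n})^T = I_{m,\,m+n} J_{m+n,\,m} = \mathbb{1}_m , \tag{6} $$

$$ (J_{m+n,\,n})^T J_{m+n,\,n} = I_{n,\,m+n} J_{m+n,\,n} = \mathbb{1}_n . \tag{7} $$

For each $A \in \mathrm{Mat}(\mathbb{C}, m, n)$ we can associate a square matrix $A' \in \mathrm{Mat}(\mathbb{C}, m+n)$ given by

$$ A' := (I_{m,\,m+n})^T A\, (J_{m+n,\,n})^T = J_{m+n,\,m}\, A\, I_{n,\,m+n} = \begin{pmatrix} A & \mathbb{0}_{m,m} \\ \mathbb{0}_{n,n} & \mathbb{0}_{n,m} \end{pmatrix} . \tag{8} $$

As one easily checks, we get from (6)-(7) the useful relation

$$ A = I_{m,\,m+n}\, A'\, J_{m+n,\,n} . \tag{9} $$

The canonical basis of vectors in $\mathbb{C}^n$ is

$$ e_1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} , \quad \ldots , \quad e_n = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix} . $$

Let $x^1, \ldots, x^n$ be vectors, represented in the canonical basis as

$$ x^a = \begin{pmatrix} x^a_1 \\ \vdots \\ x^a_n \end{pmatrix} . \tag{10} $$

We will denote by $[\![ x^1, \ldots, x^n ]\!]$ the $n \times n$ matrix constructed in such a way that its $a$-th column is the vector $x^a$,
that means,

$$ [\![ x^1, \ldots, x^n ]\!] = \begin{pmatrix} x^1_1 & \cdots & x^n_1 \\ \vdots & \ddots & \vdots \\ x^1_n & \cdots & x^n_n \end{pmatrix} . \tag{11} $$

It is obvious that $\mathbb{1} = [\![ e_1, \ldots, e_n ]\!]$. With this notation we write

$$ B\, [\![ x^1, \ldots, x^n ]\!] = [\![ Bx^1, \ldots, Bx^n ]\!] , \tag{12} $$

for any $B \in \mathrm{Mat}(\mathbb{C}, m, n)$, as one easily checks. Moreover, if $D$ is a diagonal matrix $D = \mathrm{diag}(d_1, \ldots, d_n)$, then

$$ [\![ x^1, \ldots, x^n ]\!]\, D = [\![ d_1 x^1, \ldots, d_n x^n ]\!] . \tag{13} $$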

If $v_1, \ldots, v_k$ are elements of a complex vector space $V$, we denote by $[v_1, \ldots, v_k]$ the subspace generated
by $v_1, \ldots, v_k$, i.e., the collection of all linear combinations of $v_1, \ldots, v_k$:
$[v_1, \ldots, v_k] := \left\{ \alpha_1 v_1 + \cdots + \alpha_k v_k , \; \alpha_1, \ldots, \alpha_k \in \mathbb{C} \right\}$.

More definitions and general results can be found in Appendix A.



2 The Moore-Penrose Pseudoinverse. Definition and First Properties

In this section we define the notion of a Moore-Penrose pseudoinverse and study its uniqueness. The question of the 
existence of the Moore-Penrose pseudoinverse of a given matrix is analyzed in other sections. 

Generalized inverses, or pseudoinverses 

Let $m, n \in \mathbb{N}$ and let $A \in \mathrm{Mat}(\mathbb{C}, m, n)$ be an $m \times n$ matrix (not necessarily a square matrix). A matrix
$B \in \mathrm{Mat}(\mathbb{C}, n, m)$ is said to be a generalized inverse, or a pseudoinverse, of $A$ if it satisfies the following conditions:

1. ABA = A, 

2. BAB = B. 

If $A \in \mathrm{Mat}(\mathbb{C}, n)$ is a non-singular square matrix, its inverse $A^{-1}$ satisfies trivially the defining properties of the
generalized inverse above. We will prove later that every matrix $A \in \mathrm{Mat}(\mathbb{C}, m, n)$ has at least one generalized
inverse, namely, the Moore-Penrose pseudoinverse. The general definition above is not enough to guarantee uniqueness
of the generalized inverse of any matrix $A \in \mathrm{Mat}(\mathbb{C}, m, n)$.

The definition above is too wide to be useful and it is convenient to narrow it in order to deal with certain specific 
problems. In what follows we will discuss the specific case of the Moore-Penrose pseudoinverse and its application to 
optimization of linear least squares problems. 



Defining the Moore-Penrose pseudoinverse 

Let $m, n \in \mathbb{N}$ and let $A \in \mathrm{Mat}(\mathbb{C}, m, n)$. A matrix $A^+ \in \mathrm{Mat}(\mathbb{C}, n, m)$ is said to be a Moore-Penrose pseudoinverse
of $A$ if it satisfies the following conditions:

1. $AA^+A = A$,

2. $A^+AA^+ = A^+$,

3. $AA^+ \in \mathrm{Mat}(\mathbb{C}, m)$ and $A^+A \in \mathrm{Mat}(\mathbb{C}, n)$ are self-adjoint.

It is easy to see again that if $A \in \mathrm{Mat}(\mathbb{C}, n)$ is non-singular, then its inverse satisfies all defining properties of a
Moore-Penrose pseudoinverse.
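
The two definitions above can be checked mechanically for small matrices. The following is a minimal sketch in plain Python, not part of the original text: the helper names (`adjoint`, `matmul`, `close`, `is_generalized_inverse`, `is_moore_penrose`) and the example matrices are assumptions chosen for illustration. It shows a matrix `B` that is a generalized inverse of `A` without being its Moore-Penrose pseudoinverse, which illustrates why the self-adjointness conditions are needed for uniqueness.

```python
def adjoint(M):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[M[i][j].conjugate() for i in range(len(M))]
            for j in range(len(M[0]))]

def matmul(X, Y):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def close(X, Y, tol=1e-12):
    """Entrywise comparison up to a small tolerance."""
    return all(abs(a - b) <= tol
               for row_x, row_y in zip(X, Y) for a, b in zip(row_x, row_y))

def is_generalized_inverse(A, B):
    """Conditions ABA = A and BAB = B."""
    AB, BA = matmul(A, B), matmul(B, A)
    return close(matmul(AB, A), A) and close(matmul(BA, B), B)

def is_moore_penrose(A, B):
    """The two conditions above plus self-adjointness of AB and BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return (is_generalized_inverse(A, B)
            and close(AB, adjoint(AB)) and close(BA, adjoint(BA)))

A = [[2, 0], [0, 0]]          # singular square matrix (illustrative)
A_plus = [[0.5, 0], [0, 0]]   # candidate satisfying all conditions
B = [[0.5, 1], [0, 0]]        # generalized inverse, but AB is not self-adjoint

print(is_generalized_inverse(A, A_plus), is_moore_penrose(A, A_plus))
print(is_generalized_inverse(A, B), is_moore_penrose(A, B))
```

Here both `A_plus` and `B` pass the generalized-inverse test, while only `A_plus` passes the Moore-Penrose test.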

The notion of Moore-Penrose pseudoinverse was introduced by E. H. Moore [1] in 1920 and rediscovered by R.
Penrose [2, 3] in 1955. The Moore-Penrose pseudoinverse is a useful concept in dealing with optimization problems,
such as the determination of a "least squares" solution of linear systems. We will treat such problems later (see Theorem
6.1), after dealing with the question of uniqueness and existence of the Moore-Penrose pseudoinverse.



The uniqueness of the Moore-Penrose pseudoinverse 

We will first show the uniqueness of the Moore-Penrose pseudoinverse of a given matrix $A \in \mathrm{Mat}(\mathbb{C}, m, n)$, assuming
its existence.

Let $A^+ \in \mathrm{Mat}(\mathbb{C}, n, m)$ be a Moore-Penrose pseudoinverse of $A \in \mathrm{Mat}(\mathbb{C}, m, n)$ and let $B \in \mathrm{Mat}(\mathbb{C}, n, m)$ be
another Moore-Penrose pseudoinverse of $A$, i.e., such that $ABA = A$, $BAB = B$ with $AB$ and $BA$ self-adjoint. Let
$M_1 := AB - AA^+ = A(B - A^+) \in \mathrm{Mat}(\mathbb{C}, m)$. By hypothesis, $M_1$ is self-adjoint (since it is the difference of two
self-adjoint matrices) and $(M_1)^2 = (AB - AA^+)A(B - A^+) = (ABA - AA^+A)(B - A^+) = (A - A)(B - A^+) = 0$.
Since $M_1$ is self-adjoint, the fact that $(M_1)^2 = 0$ implies that $M_1 = 0$, since for all $x \in \mathbb{C}^m$ one has $\|M_1 x\|^2 =
\langle M_1 x, M_1 x \rangle = \langle x, (M_1)^2 x \rangle = 0$. This shows that $AB = AA^+$. Following the same steps we can
prove that $BA = A^+A$ (consider the self-adjoint matrix $M_2 := BA - A^+A \in \mathrm{Mat}(\mathbb{C}, n)$ and proceed as above). Now,
all this implies that $A^+ = A^+AA^+ = A^+(AA^+) = A^+AB = (A^+A)B = BAB = B$, thus establishing uniqueness.

As we already commented, if $A \in \mathrm{Mat}(\mathbb{C}, n)$ is a non-singular square matrix, its inverse $A^{-1}$ trivially satisfies the
defining conditions of the Moore-Penrose pseudoinverse and, therefore, we have in this case $A^+ = A^{-1}$ as the unique
Moore-Penrose pseudoinverse of $A$. It is also evident from the definition that for $\mathbb{0}_{mn}$, the $m \times n$ identically zero
matrix, one has $(\mathbb{0}_{mn})^+ = \mathbb{0}_{nm}$.



Existence of the Moore-Penrose pseudoinverse 

We will present two proofs of the existence of the Moore-Penrose pseudoinverse $A^+$ for an arbitrary matrix
$A \in \mathrm{Mat}(\mathbb{C}, m, n)$. Both proofs produce algorithms for the explicit computation of $A^+$. The first one will be presented
in Section 4 (Theorems 4.3 and 5.1) and will follow from results presented below. Expressions (39) and (40) furnish
explicit expressions for the computation of $A^+$ in terms of $A$, $A^*$ and the eigenvalues of $AA^*$ or $A^*A$ (i.e., the squares
of the singular values of $A$).

The second existence proof will be presented in Appendix B and relies on the singular values decomposition
presented in Theorem A.16. For this proof, the preliminary results presented below are not required. This second
proof is the one more frequently found in the literature, but we believe that expressions (39) and (40) provide an
algorithmically simpler way for the determination of the Moore-Penrose pseudoinverse of a given matrix.



Computing the Moore-Penrose pseudoinverse in some special cases 

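If $A \in \mathrm{Mat}(\mathbb{C}, m, 1)$, $A = \begin{pmatrix} a_1 \\ \vdots \\ a_m \end{pmatrix}$, a non-zero column vector, then one can easily verify that
$A^+ = \frac{1}{\|A\|^2} A^* = \frac{1}{\|A\|^2} \left( \overline{a_1}, \ldots, \overline{a_m} \right)$, where $\|A\| = \sqrt{|a_1|^2 + \cdots + |a_m|^2}$. In particular, if $z \in \mathbb{C}$, then
$(z)^+ = \begin{cases} 0 , & z = 0 , \\ 1/z , & z \neq 0 , \end{cases}$ by taking $z$
as an element of $\mathrm{Mat}(\mathbb{C}, 1, 1)$.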
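
As a quick worked illustration of the column-vector case (the vector below is chosen only for this example and does not come from the text), one can check the defining conditions by hand:

```latex
A = \begin{pmatrix} 3 \\ 4i \end{pmatrix}, \qquad
\|A\|^2 = |3|^2 + |4i|^2 = 25, \qquad
A^+ = \frac{1}{25} \begin{pmatrix} 3 & -4i \end{pmatrix},
\\[4pt]
A^+ A = \frac{1}{25}\bigl( 3 \cdot 3 + (-4i)(4i) \bigr) = \frac{9 + 16}{25} = 1, \qquad
A A^+ = \frac{1}{25} \begin{pmatrix} 9 & -12i \\ 12i & 16 \end{pmatrix}.
```

Since $A^+A = 1$, the conditions $AA^+A = A$ and $A^+AA^+ = A^+$ follow at once, and $AA^+$ is visibly self-adjoint.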

This can be further generalized. If $A \in \mathrm{Mat}(\mathbb{C}, m, n)$ and $(AA^*)^{-1}$ exists, then

$$ A^+ = A^* (AA^*)^{-1} , \tag{14} $$

because we can readily verify that the r.h.s. satisfies the defining conditions of $A^+$. Analogously, if $(A^*A)^{-1}$ exists,
one has

$$ A^+ = (A^*A)^{-1} A^* . \tag{15} $$

For instance, for $A = \begin{pmatrix} 2 & 0 & i \\ 0 & i & 1 \end{pmatrix}$ one can check that $AA^*$ is invertible, but $A^*A$ is not, and we have
$A^+ = A^*(AA^*)^{-1} = \frac{1}{9} \begin{pmatrix} 4 & -2i \\ 1 & -5i \\ -i & 4 \end{pmatrix}$. Similarly, for $A = \begin{pmatrix} 1 & 2 \\ 0 & i \\ 0 & 3 \end{pmatrix}$, $AA^*$ is singular, but $A^*A$ is invertible and we have
$A^+ = (A^*A)^{-1}A^* = \frac{1}{10} \begin{pmatrix} 10 & 2i & -6 \\ 0 & -i & 3 \end{pmatrix}$.

The relations (14)-(15) are significant because they will provide an important hint to find the Moore-Penrose
pseudoinverse of a general matrix, as we will discuss later. In Proposition 3.2 we will show that one has in general
$A^+ = A^*(AA^*)^+ = (A^*A)^+A^*$ and in Theorem 4.3 we will discuss what can be done in the cases when $AA^*$ or $A^*A$
are not invertible.
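
Relations (14) and (15) translate directly into code when the relevant product is invertible. The sketch below uses plain Python lists and a small Gauss-Jordan routine; all function names (`inverse`, `pinv_full_row_rank`, `pinv_full_column_rank`) are hypothetical helpers introduced for illustration, and the 3 x 2 matrix is an illustrative choice with invertible $A^*A$.

```python
def adjoint(M):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[M[i][j].conjugate() for i in range(len(M))]
            for j in range(len(M[0]))]

def matmul(X, Y):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inverse(M):
    """Gauss-Jordan inverse with partial pivoting (small matrices only)."""
    n = len(M)
    aug = [[complex(M[i][j]) for j in range(n)] +
           [1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def pinv_full_row_rank(A):
    """Relation (14): A^+ = A^* (A A^*)^{-1}, valid when A A^* is invertible."""
    A_star = adjoint(A)
    return matmul(A_star, inverse(matmul(A, A_star)))

def pinv_full_column_rank(A):
    """Relation (15): A^+ = (A^* A)^{-1} A^*, valid when A^* A is invertible."""
    A_star = adjoint(A)
    return matmul(inverse(matmul(A_star, A)), A_star)

A = [[1, 2], [0, 1j], [0, 3]]        # 3 x 2, illustrative entries
A_plus = pinv_full_column_rank(A)    # 2 x 3
for row in A_plus:
    print([complex(round(v.real, 10), round(v.imag, 10)) for v in row])
```

For this matrix $A^*A$ has determinant 10, so (15) applies, whereas $AA^*$ is a 3 x 3 matrix of rank 2 and (14) does not.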



3 Further Properties of the Moore-Penrose Pseudoinverse 

The following properties of the Moore-Penrose pseudoinverse follow immediately from its definition and from
uniqueness. The proofs are elementary and left to the reader: for any $A \in \mathrm{Mat}(\mathbb{C}, m, n)$ one has

1. $(A^+)^+ = A$,

2. $(A^+)^T = (A^T)^+$, $\overline{A^+} = (\overline{A})^+$ and, consequently, $(A^+)^* = (A^*)^+$,

3. $(zA)^+ = z^{-1} A^+$ for all $z \in \mathbb{C}$, $z \neq 0$.

It is however important to remark that for $A \in \mathrm{Mat}(\mathbb{C}, m, n)$ and $B \in \mathrm{Mat}(\mathbb{C}, n, p)$, the Moore-Penrose
pseudoinverse $(AB)^+$ is not always equal to $B^+A^+$, in contrast to what happens with the usual inverse in the case $m = n = p$.
A relevant exception will be found in Proposition 3.2.

The next proposition lists some important properties that will be used below.

Proposition 3.1 The Moore-Penrose pseudoinverse satisfies the following relations:

$$ A^+ = A^+ (A^+)^* A^* , \tag{16} $$

$$ A = A A^* (A^+)^* , \tag{17} $$

$$ A^* = A^* A A^+ , \tag{18} $$

$$ A^+ = A^* (A^+)^* A^+ , \tag{19} $$

$$ A = (A^+)^* A^* A , \tag{20} $$

$$ A^* = A^+ A A^* , \tag{21} $$

valid for all $A \in \mathrm{Mat}(\mathbb{C}, m, n)$. □

For us, the most relevant of the relations above is relation (18), since we will make use of it in the proof of
Theorem 6.1 when we deal with optimization of least squares problems.

Proof of Proposition 3.1. Since $AA^+$ is self-adjoint, one has $AA^+ = (AA^+)^* = (A^+)^* A^*$. Multiplying to the left by
$A^+$, we get $A^+ = A^+ (A^+)^* A^*$, proving (16). Replacing $A \to A^+$ and using the fact that $A = (A^+)^+$, one gets from
(16) $A = AA^*(A^+)^*$, which is relation (17). Replacing $A \to A^*$ and using the fact that $(A^*)^+ = (A^+)^*$, we get from
(17) that $A^* = A^*AA^+$, which is relation (18).

Relations (19)-(21) can be obtained analogously from the fact that $A^+A$ is also self-adjoint, but they follow more
easily by replacing $A \to A^*$ in (16)-(18) and by taking the adjoint of the resulting expressions. ■



From Proposition 3.1 other interesting results can be obtained, some of which are listed in the following proposition:

Proposition 3.2 For all $A \in \mathrm{Mat}(\mathbb{C}, m, n)$ one has

$$ (AA^*)^+ = (A^*)^+ A^+ . \tag{22} $$

From this we get

$$ A^+ = A^* (AA^*)^+ = (A^*A)^+ A^* , \tag{23} $$

also valid for all $A \in \mathrm{Mat}(\mathbb{C}, m, n)$. □

Expression (23) generalizes (14)-(15) and can be employed to compute $A^+$ provided $(AA^*)^+$ or $(A^*A)^+$ is
previously known.

Proof of Proposition 3.2. Let $B = (A^*)^+ A^+$. One has

$$ AA^* = AA^* (A^+)^* A^* = AA^* (A^+)^* A^+ A A^* = (AA^*) B (AA^*) , $$

where we use (17), (21) and the fact that $(A^*)^+ = (A^+)^*$. One also has

$$ B = (A^*)^+ A^+ = (A^+)^* A^+ A A^+ = (A^+)^* A^+ A A^* (A^+)^* A^+ = B (AA^*) B . $$

Notice that

$$ (AA^*) B = \big( AA^* (A^+)^* \big) A^+ = A A^+ , $$

which is self-adjoint, by definition. Analogously,

$$ B (AA^*) = (A^+)^* \big( A^+ A A^* \big) = (A^*)^+ A^* , $$

which is also self-adjoint. The facts exposed in the lines above prove that $B$ is the Moore-Penrose pseudoinverse of
$AA^*$, establishing (22). Replacing $A \to A^*$ in (22), one also gets

$$ (A^*A)^+ = A^+ (A^*)^+ . \tag{24} $$

Notice now that

$$ A^* (AA^*)^+ = A^* (A^*)^+ A^+ = A^+ $$

and that

$$ (A^*A)^+ A^* = A^+ (A^*)^+ A^* = A^+ , $$

establishing (23). ■

The kernel and the range of a matrix and the Moore-Penrose pseudoinverse



The kernel and the range (or image) of a matrix $A \in \mathrm{Mat}(\mathbb{C}, m, n)$ are defined by $\mathrm{Ker}(A) := \{u \in \mathbb{C}^n \,|\, Au = 0\}$ and
$\mathrm{Ran}(A) := \{Au, \; u \in \mathbb{C}^n\}$, respectively. It is evident that $\mathrm{Ker}(A)$ is a linear subspace of $\mathbb{C}^n$ and that $\mathrm{Ran}(A)$ is a
linear subspace of $\mathbb{C}^m$.

The following proposition will be used below, but is interesting by itself. 

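Proposition 3.3 Let $A \in \mathrm{Mat}(\mathbb{C}, m, n)$ and let us define $P_1 := \mathbb{1}_n - A^+A \in \mathrm{Mat}(\mathbb{C}, n)$ and $P_2 := \mathbb{1}_m - AA^+ \in
\mathrm{Mat}(\mathbb{C}, m)$. Then, the following claims are valid:

1. $P_1$ and $P_2$ are orthogonal projectors, that means, they satisfy $(P_k)^2 = P_k$ and $P_k^* = P_k$, $k = 1, 2$.

2. $\mathrm{Ker}(A) = \mathrm{Ran}(P_1)$, $\mathrm{Ran}(A) = \mathrm{Ker}(P_2)$, $\mathrm{Ker}(A^+) = \mathrm{Ran}(P_2)$ and $\mathrm{Ran}(A^+) = \mathrm{Ker}(P_1)$.

3. $\mathrm{Ran}(A) = \mathrm{Ker}(A^+)^\perp$ and $\mathrm{Ran}(A^+) = \mathrm{Ker}(A)^\perp$.

4. $\mathrm{Ker}(A) \oplus \mathrm{Ran}(A^+) = \mathbb{C}^n$ and $\mathrm{Ker}(A^+) \oplus \mathrm{Ran}(A) = \mathbb{C}^m$, both being direct sums of orthogonal subspaces. □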
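
The claims of Proposition 3.3 can be checked numerically on a small example. The sketch below is illustrative only: it uses a hypothetical 2 x 1 column vector `A` together with its pseudoinverse $A^*/\|A\|^2$, and plain-Python helpers (`adjoint`, `matmul`, `close`, `identity`, `subtract`, `zeros`) that are assumptions of this example rather than notation from the text.

```python
def adjoint(M):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[M[i][j].conjugate() for i in range(len(M))]
            for j in range(len(M[0]))]

def matmul(X, Y):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def close(X, Y, tol=1e-12):
    """Entrywise comparison up to a small tolerance."""
    return all(abs(a - b) <= tol
               for row_x, row_y in zip(X, Y) for a, b in zip(row_x, row_y))

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def subtract(X, Y):
    return [[a - b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]

A = [[3], [4j]]                    # 2 x 1 column vector (m = 2, n = 1)
A_plus = [[3 / 25, -4j / 25]]      # A^* / ||A||^2, its pseudoinverse

P1 = subtract(identity(1), matmul(A_plus, A))   # 1_n - A^+ A, here 1 x 1
P2 = subtract(identity(2), matmul(A, A_plus))   # 1_m - A A^+, here 2 x 2

print("P2 idempotent:  ", close(matmul(P2, P2), P2))
print("P2 self-adjoint:", close(P2, adjoint(P2)))
print("P2 A = 0:       ", close(matmul(P2, A), zeros(2, 1)))       # Ran(A) in Ker(P2)
print("A^+ P2 = 0:     ", close(matmul(A_plus, P2), zeros(1, 2)))  # Ran(P2) in Ker(A^+)
print("A P1 = 0:       ", close(matmul(A, P1), zeros(2, 1)))       # Ran(P1) in Ker(A)
```

Since this `A` has trivial kernel, `P1` is numerically the zero matrix, while `P2` projects onto the orthogonal complement of the line spanned by `A`.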

Proof. Since $AA^+$ and $A^+A$ are self-adjoint, so are $P_1$ and $P_2$. One also has $(P_1)^2 = \mathbb{1} - 2A^+A + A^+AA^+A =
\mathbb{1} - 2A^+A + A^+A = \mathbb{1} - A^+A = P_1$ and analogously for $P_2$. This proves item 1.

Let $x \in \mathrm{Ker}(A)$. Since $\mathrm{Ran}(P_1)$ is a closed linear subspace of $\mathbb{C}^n$, the "Best Approximant Theorem", Theorem
A.1, and the Orthogonal Decomposition Theorem, Theorem A.3, guarantee the existence of a unique $z_0 \in \mathrm{Ran}(P_1)$
such that $\|x - z_0\| = \min \{ \|x - z\|, \; z \in \mathrm{Ran}(P_1) \}$. Moreover, $x - z_0$ is orthogonal to $\mathrm{Ran}(P_1)$. Hence, there exists
at least one $y_0 \in \mathbb{C}^n$ such that $x - P_1 y_0$ is orthogonal to every element of the form $P_1 y$, i.e., $\langle x - P_1 y_0, P_1 y \rangle = 0$
for all $y \in \mathbb{C}^n$, which implies $\langle P_1(x - P_1 y_0), y \rangle = 0$ for all $y \in \mathbb{C}^n$, which, in turn, implies $P_1(x - P_1 y_0) = 0$. This,
however, says that $P_1 x = P_1 y_0$. Since $x \in \mathrm{Ker}(A)$, one has $P_1 x = x$ (by the definition of $P_1$). We have therefore proved
that if $x \in \mathrm{Ker}(A)$ then $x \in \mathrm{Ran}(P_1)$, establishing that $\mathrm{Ker}(A) \subset \mathrm{Ran}(P_1)$. On the other hand, the fact that
$AP_1 = A(\mathbb{1} - A^+A) = A - A = 0$ implies $\mathrm{Ran}(P_1) \subset \mathrm{Ker}(A)$, establishing that $\mathrm{Ran}(P_1) = \mathrm{Ker}(A)$.

If $z \in \mathrm{Ker}(P_1)$, then $z = A^+Az$, proving that $z \in \mathrm{Ran}(A^+)$. This establishes that $\mathrm{Ker}(P_1) \subset \mathrm{Ran}(A^+)$. On the
other hand, if $u \in \mathrm{Ran}(A^+)$ then there exists $v \in \mathbb{C}^m$ such that $u = A^+v$. Therefore, $P_1 u = (\mathbb{1}_n - A^+A)A^+v = (A^+ -
A^+AA^+)v = 0$, proving that $u \in \mathrm{Ker}(P_1)$ and that $\mathrm{Ran}(A^+) \subset \mathrm{Ker}(P_1)$. This establishes that $\mathrm{Ker}(P_1) = \mathrm{Ran}(A^+)$.

$P_2$ is obtained from $P_1$ by the substitution $A \to A^+$ (recalling that $(A^+)^+ = A$). Hence, the results above imply
that $\mathrm{Ran}(P_2) = \mathrm{Ker}(A^+)$ and that $\mathrm{Ker}(P_2) = \mathrm{Ran}(A)$. This proves item 2.

If $M \in \mathrm{Mat}(\mathbb{C}, p)$ (with $p \in \mathbb{N}$, arbitrary) is self-adjoint, then $\langle y, Mx \rangle = \langle My, x \rangle$ for all $x, y \in \mathbb{C}^p$. This relation
makes evident that $\mathrm{Ker}(M) = \mathrm{Ran}(M)^\perp$. Therefore, item 3 follows from item 2 by taking $M = P_1$ and $M = P_2$.
Item 4 is evident from item 3. ■



4 Tikhonov's Regularization and Existence Theorem for the Moore-Penrose Pseudoinverse

In (14) and (15) we saw that if $(AA^*)^{-1}$ exists, then $A^+ = A^*(AA^*)^{-1}$ and that if $(A^*A)^{-1}$ exists, then $A^+ =
(A^*A)^{-1}A^*$. If those inverses do not exist, there is an alternative procedure to obtain $A^+$. We know from Proposition
A.4 that even if $(AA^*)^{-1}$ does not exist, the matrix $AA^* + \mu \mathbb{1}$ will be invertible for all non-vanishing $\mu \in \mathbb{C}$ with $|\mu|$
small enough. Hence, we could conjecture that the expressions $A^*(AA^* + \mu \mathbb{1})^{-1}$ and $(A^*A + \mu \mathbb{1})^{-1}A^*$ are well-defined
for $\mu \neq 0$ and $|\mu|$ small enough and converge to $A^+$ when the limit $\mu \to 0$ is taken. As we will now show, this conjecture
is correct.

The provisional replacement of the singular matrices $AA^*$ or $A^*A$ by the non-singular ones $AA^* + \mu \mathbb{1}$ or $A^*A + \mu \mathbb{1}$
(with $\mu \neq 0$ and $|\mu|$ "small") is a regularization procedure known as Tikhonov's regularization. This procedure was
introduced by Tikhonov in [4] (see also [5] and, for historical remarks, [30]) in his search for uniform approximations
for the solutions of Fredholm's equation of the first kind

$$ \int_a^b k(x, y)\, u(y)\, dy = f(x) , \tag{25} $$

where $-\infty < a < b < \infty$ and where $k$ and $f$ are given functions satisfying adequate smoothness conditions. In
operator form, (25) becomes $Ku = f$ and $K$ is well known to be a compact operator (see, e.g., [6]) if $k$ is a continuous
function. By using the method of finite differences or by using expansions in terms of orthogonal functions, the inverse
problem (25) can be replaced by an approximating inverse matrix problem $Ax = y$, like (1). By applying $A^*$ to the
left, one gets $A^*Ax = A^*y$. Since the inverse of $A^*A$ may not exist, one first considers a solution $x_\mu$ of the regularized
equation $(A^*A + \mu \mathbb{1}) x_\mu = A^*y$, with some adequate $\mu \in \mathbb{C}$, and asks whether the limit $\lim_{|\mu| \to 0} (A^*A + \mu \mathbb{1})^{-1} A^*y$
can be taken. As we will see, the limit exists and is given precisely by $A^+y$. In Tikhonov's case, the regularized
equation $(A^*A + \mu \mathbb{1}) x_\mu = A^*y$ can be obtained from a related Fredholm's equation of the second kind, namely
$K^*K u_\mu + \mu u_\mu = K^*f$, for which the existence of solutions, i.e., the existence of the inverse $(K^*K + \mu \mathbb{1})^{-1}$, is granted
by Fredholm's Alternative Theorem (see, e.g., [6]) for all $\mu$ in the resolvent set of $K^*K$ and, therefore, for all $\mu > 0$
(since $K^*K$ is a positive compact operator; see footnote 1). It is then a technical matter to show that the limit
$\lim_{\mu \to 0,\, \mu > 0} u_\mu$ exists and provides a uniform approximation to a solution of (25).

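Tikhonov, however, does not point to the relation of his ideas to the theory of the Moore-Penrose inverse. This
will be described in what follows. Our first result, presented in the next two lemmas, establishes that the limits
$\lim_{\mu \to 0} A^*(AA^* + \mu \mathbb{1}_m)^{-1}$ and $\lim_{\mu \to 0} (A^*A + \mu \mathbb{1}_n)^{-1}A^*$, described above, indeed exist and are equal.

Lemma 4.1 Let $A \in \mathrm{Mat}(\mathbb{C}, m, n)$ and let $\mu \in \mathbb{C}$ be such that $AA^* + \mu \mathbb{1}_m$ and $A^*A + \mu \mathbb{1}_n$ are non-singular (that
means $-\mu \notin \sigma(AA^*) \cup \sigma(A^*A)$, a finite set). Then, $A^*(AA^* + \mu \mathbb{1}_m)^{-1} = (A^*A + \mu \mathbb{1}_n)^{-1}A^*$. □

Recall that, by Proposition A.7, $\sigma(AA^*)$ and $\sigma(A^*A)$ differ at most by the element 0.

Proof of Lemma 4.1. Let $B_\mu := A^*(AA^* + \mu \mathbb{1}_m)^{-1}$ and $C_\mu := (A^*A + \mu \mathbb{1}_n)^{-1}A^*$. We have

$$ A^*A B_\mu = A^* \big[ AA^* \big] (AA^* + \mu \mathbb{1}_m)^{-1} = A^* \big[ AA^* + \mu \mathbb{1}_m - \mu \mathbb{1}_m \big] (AA^* + \mu \mathbb{1}_m)^{-1} = A^* \Big( \mathbb{1}_m - \mu (AA^* + \mu \mathbb{1}_m)^{-1} \Big) = A^* - \mu B_\mu . $$

Therefore, $(A^*A + \mu \mathbb{1}_n) B_\mu = A^*$, which implies $B_\mu = (A^*A + \mu \mathbb{1}_n)^{-1}A^* = C_\mu$. ■

Lemma 4.2 For all $A \in \mathrm{Mat}(\mathbb{C}, m, n)$ the limits $\lim_{\mu \to 0} A^*(AA^* + \mu \mathbb{1}_m)^{-1}$ and $\lim_{\mu \to 0} (A^*A + \mu \mathbb{1}_n)^{-1}A^*$ exist and are
equal (by Lemma 4.1), defining an element of $\mathrm{Mat}(\mathbb{C}, n, m)$. □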



Proof. Notice first that $A$ is an identically zero matrix iff $AA^*$ or $A^*A$ are zero matrices. In fact, if, for instance,
$A^*A = 0$, then for any vector $x$ one has $0 = \langle x, A^*Ax \rangle = \langle Ax, Ax \rangle = \|Ax\|^2$, proving that $A = 0$. Hence we will
assume that $AA^*$ and $A^*A$ are non-zero matrices.

The matrix $AA^* \in \mathrm{Mat}(\mathbb{C}, m)$ is evidently self-adjoint. Let $\alpha_1, \ldots, \alpha_r$ be its distinct eigenvalues. By the Spectral
Theorem for self-adjoint matrices (see Theorems A.9 and A.13) we may write

$$ AA^* = \sum_{a=1}^{r} \alpha_a E_a , \tag{26} $$

where $E_a$ are the spectral projectors of $AA^*$ and satisfy $E_a E_b = \delta_{ab} E_a$, $E_a^* = E_a$ and $\sum_{a=1}^{r} E_a = \mathbb{1}_m$. Therefore,

$$ AA^* + \mu \mathbb{1}_m = \sum_{a=1}^{r} (\alpha_a + \mu) E_a $$

and, hence, for $\mu \notin \{-\alpha_1, \ldots, -\alpha_r\}$, one has, by (50),

$$ (AA^* + \mu \mathbb{1}_m)^{-1} = \sum_{a=1}^{r} \frac{1}{\alpha_a + \mu} E_a \quad \text{and} \quad A^*(AA^* + \mu \mathbb{1}_m)^{-1} = \sum_{a=1}^{r} \frac{1}{\alpha_a + \mu} A^* E_a . \tag{27} $$

There are now two cases to be considered: 1. zero is not an eigenvalue of $AA^*$ and 2. zero is an eigenvalue of $AA^*$.
In case 1, it is clear from (27) that the limit $\lim_{\mu \to 0} A^*(AA^* + \mu \mathbb{1}_m)^{-1}$ exists and

$$ \lim_{\mu \to 0} A^*(AA^* + \mu \mathbb{1}_m)^{-1} = \sum_{a=1}^{r} \frac{1}{\alpha_a} A^* E_a . \tag{28} $$

In case 2, let us have, say, $\alpha_1 = 0$. The corresponding spectral projector $E_1$ projects on the kernel of $AA^*$:
$\mathrm{Ker}(AA^*) := \{u \in \mathbb{C}^m \,|\, AA^*u = 0\}$. If $x \in \mathrm{Ker}(AA^*)$, then $A^*x = 0$, because $0 = \langle x, AA^*x \rangle = \langle A^*x, A^*x \rangle =
\|A^*x\|^2$. Therefore,

$$ A^* E_1 = 0 \tag{29} $$



(Footnote 1: Tikhonov's argument in [4] is actually more complicated, since he does not consider the regularized equation
$(K^*K + \mu \mathbb{1}) u_\mu = K^*f$, but a more general version where the identity operator $\mathbb{1}$ is replaced by a Sturm-Liouville operator.)

and, hence, we may write

$$ A^*(AA^* + \mu \mathbb{1}_m)^{-1} = \sum_{a=2}^{r} \frac{1}{\alpha_a + \mu} A^* E_a , $$

from which we get

$$ \lim_{\mu \to 0} A^*(AA^* + \mu \mathbb{1}_m)^{-1} = \sum_{a=2}^{r} \frac{1}{\alpha_a} A^* E_a . \tag{30} $$

This proves that $\lim_{\mu \to 0} A^*(AA^* + \mu \mathbb{1}_m)^{-1}$ always exists. By Lemma 4.1 the limit $\lim_{\mu \to 0} (A^*A + \mu \mathbb{1}_n)^{-1}A^*$ also exists
and coincides with $\lim_{\mu \to 0} A^*(AA^* + \mu \mathbb{1}_m)^{-1}$. ■

The main consequence is the following theorem, which contains a general proof for the existence of the Moore-Penrose
pseudoinverse:

Theorem 4.3 (Tikhonov's Regularization) For all $A \in \mathrm{Mat}(\mathbb{C}, m, n)$ one has

$$ A^+ = \lim_{\mu \to 0} A^*(AA^* + \mu \mathbb{1}_m)^{-1} \tag{31} $$

and

$$ A^+ = \lim_{\mu \to 0} (A^*A + \mu \mathbb{1}_n)^{-1} A^* . \tag{32} $$

□
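
The limit in (31) can be observed numerically. The following is a minimal sketch under stated assumptions: the rank-one matrix `A` and its pseudoinverse `A_plus` (equal to $A^*/4$ for this particular matrix) are illustrative choices, and `tikhonov`, `inverse` and `max_abs_diff` are hypothetical helper names. Very small values of $\mu$ are avoided on purpose, because the regularized matrix becomes ill-conditioned in floating-point arithmetic.

```python
def adjoint(M):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[M[i][j].conjugate() for i in range(len(M))]
            for j in range(len(M[0]))]

def matmul(X, Y):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inverse(M):
    """Gauss-Jordan inverse with partial pivoting (small matrices only)."""
    n = len(M)
    aug = [[complex(M[i][j]) for j in range(n)] +
           [1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def tikhonov(A, mu):
    """Regularized expression A^* (A A^* + mu 1_m)^{-1} for mu != 0."""
    A_star = adjoint(A)
    G = matmul(A, A_star)
    m = len(G)
    shifted = [[G[i][j] + (mu if i == j else 0) for j in range(m)]
               for i in range(m)]
    return matmul(A_star, inverse(shifted))

def max_abs_diff(X, Y):
    return max(abs(a - b) for rx, ry in zip(X, Y) for a, b in zip(rx, ry))

A = [[1, 1], [1, 1]]                      # singular: A A^* has eigenvalues 0 and 4
A_plus = [[0.25, 0.25], [0.25, 0.25]]     # A^* / 4 for this rank-one matrix

for mu in (1e-1, 1e-2, 1e-3):
    print(mu, max_abs_diff(tikhonov(A, mu), A_plus))
```

For this example the regularized expression equals $A^*/(4 + \mu)$, so the printed deviation shrinks roughly in proportion to $\mu$.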

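Proof. The statements to be proven are evident if $A = \mathbb{0}_{mn}$ because, as we already saw, $(\mathbb{0}_{mn})^+ = \mathbb{0}_{nm}$. Hence, we
will assume that $A$ is a non-zero matrix. This is equivalent (by the comments found in the proof of Lemma 4.2) to
assuming that $AA^*$ and $A^*A$ are non-zero matrices.

By Lemmas 4.1 and 4.2 it is enough to prove (31). There are two cases to be considered: 1. zero is not an
eigenvalue of $AA^*$ and 2. zero is an eigenvalue of $AA^*$. In case 1, we saw in (28) that

$$ \lim_{\mu \to 0} A^*(AA^* + \mu \mathbb{1}_m)^{-1} = \sum_{a=1}^{r} \frac{1}{\alpha_a} A^* E_a =: B . $$

Notice now that

$$ AB = \sum_{a=1}^{r} \frac{1}{\alpha_a} AA^* E_a = \sum_{a=1}^{r} \frac{1}{\alpha_a} \left( \sum_{b=1}^{r} \alpha_b E_b \right) E_a = \sum_{a=1}^{r} \sum_{b=1}^{r} \frac{1}{\alpha_a} \alpha_b \delta_{ab} E_a = \sum_{a=1}^{r} E_a = \mathbb{1}_m , \tag{33} $$

which is self-adjoint and that

$$ BA = \sum_{a=1}^{r} \frac{1}{\alpha_a} A^* E_a A , \tag{34} $$

which is also self-adjoint, because $\alpha_a \in \mathbb{R}$ for all $a$ and because $(A^* E_a A)^* = A^* E_a A$ for all $a$, since $E_a^* = E_a$.
From (33) it follows that $ABA = A$. From (34) it follows that

$$ BAB = \left( \sum_{a=1}^{r} \frac{1}{\alpha_a} A^* E_a A \right) \left( \sum_{b=1}^{r} \frac{1}{\alpha_b} A^* E_b \right) = \sum_{a=1}^{r} \sum_{b=1}^{r} \frac{1}{\alpha_a \alpha_b} A^* E_a (AA^*) E_b . $$

Now, by the spectral decomposition (26) for $AA^*$, it follows that $(AA^*) E_b = \alpha_b E_b$. Therefore,

$$ BAB = \sum_{a=1}^{r} \sum_{b=1}^{r} \frac{1}{\alpha_a} A^* E_a E_b = \left( \sum_{a=1}^{r} \frac{1}{\alpha_a} A^* E_a \right) \left( \sum_{b=1}^{r} E_b \right) = B , $$

since $\sum_{b=1}^{r} E_b = \mathbb{1}_m$. This proves that $B = A^+$ when $0$ is not an eigenvalue of $AA^*$.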

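Let us now consider the case when $AA^*$ has a zero eigenvalue, say, $\alpha_1$. As we saw in (30),

$$ \lim_{\mu \to 0} A^*(AA^* + \mu \mathbb{1}_m)^{-1} = \sum_{a=2}^{r} \frac{1}{\alpha_a} A^* E_a =: B . $$

Using the fact that $(AA^*) E_a = \alpha_a E_a$ (which follows from the spectral decomposition (26) for $AA^*$), we get

$$ AB = \sum_{a=2}^{r} \frac{1}{\alpha_a} AA^* E_a = \sum_{a=2}^{r} \frac{1}{\alpha_a} \alpha_a E_a = \sum_{a=2}^{r} E_a = \mathbb{1}_m - E_1 , \tag{35} $$

which is self-adjoint, since $E_1$ is self-adjoint. We also have

$$ BA = \sum_{a=2}^{r} \frac{1}{\alpha_a} A^* E_a A , \tag{36} $$

which is also self-adjoint.

From (35), it follows that $ABA = A - E_1 A$. Notice now that $(E_1 A)^* = A^* E_1 = 0$, by (29). This establishes that
$E_1 A = 0$ and that $ABA = A$. From (36), it follows that

$$ BAB = \left( \sum_{a=2}^{r} \frac{1}{\alpha_a} A^* E_a A \right) \left( \sum_{b=2}^{r} \frac{1}{\alpha_b} A^* E_b \right) = \sum_{a=2}^{r} \sum_{b=2}^{r} \frac{1}{\alpha_a \alpha_b} A^* E_a (AA^*) E_b . $$

Using again $(AA^*) E_b = \alpha_b E_b$, we get

$$ BAB = \sum_{a=2}^{r} \sum_{b=2}^{r} \frac{1}{\alpha_a} A^* E_a E_b = \left( \sum_{a=2}^{r} \frac{1}{\alpha_a} A^* E_a \right) \left( \mathbb{1}_m - E_1 \right) = B - \sum_{a=2}^{r} \frac{1}{\alpha_a} A^* E_a E_1 = B , $$

since $E_a E_1 = 0$ for $a \neq 1$. This shows that $BAB = B$. Hence, we established that $B = A^+$ also in the case when $AA^*$
has a zero eigenvalue, completing the proof of (31). ■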



5 The Moore-Penrose Pseudoinverse and the Spectral Theorem 



The proof of Theorem 14.31 also establishes the following facts: 

Theorem 5.1 Let A £ Mat (C, m, n) be a non-zero matrix and let AA* = X]I=i ctaEa be the spectral representation 
of AA* , where {ai, . . . , ar} C R is the set of distinct eigenvalues of AA* and Ea are the corresponding self-adjoint 
spectral projections. Then, we have 



A^ 



E —A*Ea 



(37) 



a = l 



Analogously, let A* A = X]b=i PbFb be the spectral representation of A* A, where {/3i, . . . , /3s} C R is the set of distinct 
eigenvalues of A* A and Ft the corresponding self-adjoint spectral projections. Then, we also have 



(38) 



b=l 



Is it worth mentioning that, by Proposition \A.T\ the sets of non-zero eigenvalues of A A* and of A* A coincide: 
{ai, Q.}\{0} = {/3i, /34\{0};. 

From \3T\ ) and fSSH it follows that for a non-zero matrix A we have 



( 



A-" 



i- Ii 



a = l 



[da — ai 



A* 



\ '=1 



Y[ (^AA* - ail^^ 



1 = 1 



(39) 



b=i \ 1=1 



Yl (^A*A - I3lln) 



L = l 
l^b 



A* 



(40) 



□ 
Expressions (39) or (40) provide a general algorithm for the computation of the Moore-Penrose pseudoinverse for any non-zero matrix $A$. Its implementation requires only the determination of the eigenvalues of $AA^*$ or of $A^*A$ and the computation of polynomials in $AA^*$ or $A^*A$.

Proof of Theorem 5.1. Eq. (37) was established in the proof of Theorem 4.3 (see (28) and (30)). Relation (38) can be proven analogously, but it also follows more easily from (37), by replacing $A \to A^*$ and taking the adjoint of the resulting expression. Relations (39) and (40) follow from Proposition A.11, particularly from the explicit formula for the spectral projector given in (52). ■
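As an illustration, formula (39) can be turned into a few lines of numerical code. The following is only a minimal sketch, assuming NumPy is available; the function names `distinct_eigenvalues` and `pinv_spectral` and the tolerance `tol` (used to decide when two computed eigenvalues are regarded as equal, or as zero) are our own choices and not part of the theory above.

```python
import numpy as np

def distinct_eigenvalues(H, tol=1e-9):
    """Distinct eigenvalues of a self-adjoint matrix H (merged within a tolerance)."""
    eigs = np.linalg.eigvalsh(H)                 # real eigenvalues, ascending order
    eps = tol * max(1.0, float(np.max(np.abs(eigs))))
    distinct = [eigs[0]]
    for ev in eigs[1:]:
        if ev - distinct[-1] > eps:              # start a new cluster
            distinct.append(ev)
    return distinct, eps

def pinv_spectral(A, tol=1e-9):
    """Moore-Penrose pseudoinverse computed from formula (39)."""
    A = np.asarray(A, dtype=complex)
    m, n = A.shape
    G = A @ A.conj().T                           # AA^*, an m x m self-adjoint matrix
    alphas, eps = distinct_eigenvalues(G, tol)
    I = np.eye(m)
    A_plus = np.zeros((n, m), dtype=complex)
    for a, alpha_a in enumerate(alphas):
        if abs(alpha_a) <= eps:                  # the zero eigenvalue is skipped
            continue
        E_a = I.astype(complex)                  # spectral projector, built as in (52)
        for l, alpha_l in enumerate(alphas):
            if l != a:
                E_a = E_a @ (G - alpha_l * I) / (alpha_a - alpha_l)
        A_plus += (A.conj().T @ E_a) / alpha_a
    return A_plus

A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
A_plus = pinv_spectral(A)
```

For well-separated eigenvalues the result agrees with `numpy.linalg.pinv` up to rounding errors. When eigenvalues are nearly degenerate the differences $\alpha_a - \alpha_l$ become small, and a method based on the singular values decomposition is usually the safer numerical choice.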



6 The Moore-Penrose Pseudoinverse and Least Squares 

Let us now consider one of the main applications of the Moore-Penrose pseudoinverse, namely, to linear least squares problems. Let $A \in \mathrm{Mat}(\mathbb{C}, m, n)$ and $y \in \mathbb{C}^m$ be given and consider the problem of finding $x \in \mathbb{C}^n$ satisfying the linear equation

$$ Ax = y . \qquad (41) $$

If $m = n$ and $A$ has an inverse, the (unique) solution is, evidently, $x = A^{-1}y$. In the other cases the solution may not exist or may not be unique. We can, however, consider the alternative problem of finding the set of all vectors $x' \in \mathbb{C}^n$ such that the Euclidean norm $\|Ax' - y\|$ reaches its least possible value. This set is called the minimizing set of the linear problem (41). Such vectors $x' \in \mathbb{C}^n$ would be the best approximants for the solution of (41) in terms of the Euclidean norm, i.e., in terms of "least squares". As we will show, the Moore-Penrose pseudoinverse provides this set of vectors $x'$ that minimize $\|Ax' - y\|$. The main result is condensed in the following theorem:

Theorem 6.1 Let $A \in \mathrm{Mat}(\mathbb{C}, m, n)$ and $y \in \mathbb{C}^m$ be given. Then, the set of all vectors of $\mathbb{C}^n$ for which the map $\mathbb{C}^n \ni x \mapsto \|Ax - y\| \in [0, \infty)$ assumes a minimum coincides with the set

$$ A^+y + \mathrm{Ker}(A) = \left\{ A^+y + \left( \mathbb{1}_n - A^+A \right) z , \; z \in \mathbb{C}^n \right\} . \qquad (42) $$

By Proposition 3.3 we also have $A^+y + \mathrm{Ker}(A) = A^+y + \mathrm{Ran}(A^+)^\perp$. □

Theorem 6.1 says that the minimizing set of the linear problem (41) consists of all vectors obtained by adding to the vector $A^+y$ an element of the kernel of $A$, i.e., of all vectors obtained by adding to $A^+y$ a vector annihilated by $A$. Notice that for the elements $x'$ of the minimizing set of the linear problem (41) one has $\|Ax' - y\| = \|(AA^+ - \mathbb{1}_m)y\| = \|P_2 y\|$, which vanishes if and only if $y \in \mathrm{Ker}(P_2) = \mathrm{Ran}(A)$ (by Proposition 3.3), a rather obvious fact.

Proof of Theorem 6.1. The image of $A$, $\mathrm{Ran}(A)$, is a closed linear subspace of $\mathbb{C}^m$. The Best Approximant Theorem and the Orthogonal Decomposition Theorem guarantee the existence of a unique $y_0 \in \mathrm{Ran}(A)$ such that $\|y_0 - y\|$ is minimal, and that this $y_0$ is such that $y_0 - y$ is orthogonal to $\mathrm{Ran}(A)$.

Hence, there exists at least one $x_0 \in \mathbb{C}^n$ such that $\|Ax_0 - y\|$ is minimal. Such $x_0$ is not necessarily unique and, as one easily sees, $x_1 \in \mathbb{C}^n$ has the same properties if and only if $x_0 - x_1 \in \mathrm{Ker}(A)$ (since $Ax_0 = y_0$ and $Ax_1 = y_0$, by the uniqueness of $y_0$). As we already observed, $Ax_0 - y$ is orthogonal to $\mathrm{Ran}(A)$, i.e., $\langle (Ax_0 - y), Au \rangle = 0$ for all $u \in \mathbb{C}^n$. This means that $\langle (A^*Ax_0 - A^*y), u \rangle = 0$ for all $u \in \mathbb{C}^n$ and, therefore, $x_0$ satisfies

$$ A^*Ax_0 = A^*y . \qquad (43) $$

Now, relation (18) shows us that $x_0 = A^+y$ satisfies (43), because $A^*AA^+y = A^*y$. Therefore, we conclude that the set of all $x \in \mathbb{C}^n$ satisfying the condition of $\|Ax - y\|$ being minimal is composed of all vectors of the form $A^+y + x_1$ with $x_1 \in \mathrm{Ker}(A)$. By Proposition 3.3, $x_1$ is of the form $x_1 = (\mathbb{1}_n - A^+A)z$ for some $z \in \mathbb{C}^n$, completing the proof. ■
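The content of Theorem 6.1 can be checked numerically on a small example. The sketch below assumes NumPy; the matrix `A` (of rank 2, hence with a non-trivial kernel), the vector `y` and all variable names are illustrative choices of ours. It verifies that $A^+y$ satisfies the normal equation (43), that adding an element of $\mathrm{Ker}(A)$ does not change the residual, and that moving in a direction of $\mathrm{Ran}(A^*)$ increases it.

```python
import numpy as np

# A 4 x 3 matrix of rank 2: the third column is the sum of the first two.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
y = np.array([1.0, 0.0, 2.0, 1.0])       # not in Ran(A), so no exact solution exists

A_plus = np.linalg.pinv(A, rcond=1e-12)
x0 = A_plus @ y                          # the distinguished minimizer A^+ y

def residual(x):
    """Euclidean norm ||Ax - y||."""
    return np.linalg.norm(A @ x - y)

# Another element of the minimizing set (42): x0 + (1 - A^+ A) z.
rng = np.random.default_rng(0)
z = rng.standard_normal(3)
x1 = x0 + (np.eye(3) - A_plus @ A) @ z

# A vector outside the minimizing set: x0 shifted along a direction in Ran(A^*).
x2 = x0 + 0.1 * A.T[:, 0]

print(residual(x0), residual(x1), residual(x2))
```

The first two printed residuals coincide up to rounding, while the third one is strictly larger.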



Appendices 

A A Brief Review of Hilbert Space Theory and Linear Algebra 

In this appendix we collect the more important definitions and results on Linear Algebra and Hilbert space theory 
that we used in the main part of this paper. For the benefit of the reader, especially of students, we provide all results 
with proofs. 
Hilbert spaces. Basic definitions

A scalar product in a complex vector space $V$ is a function $V \times V \to \mathbb{C}$, denoted here by $\langle \cdot, \cdot \rangle$, such that the following conditions are satisfied: 1. For all $u \in V$ one has $\langle u, u \rangle \geq 0$ and $\langle u, u \rangle = 0$ if and only if $u = 0$; 2. for all $u, v_1, v_2 \in V$ and all $\alpha_1, \alpha_2 \in \mathbb{C}$ one has $\langle u, (\alpha_1 v_1 + \alpha_2 v_2) \rangle = \alpha_1 \langle u, v_1 \rangle + \alpha_2 \langle u, v_2 \rangle$ and $\langle (\alpha_1 v_1 + \alpha_2 v_2), u \rangle = \overline{\alpha_1} \langle v_1, u \rangle + \overline{\alpha_2} \langle v_2, u \rangle$; 3. $\overline{\langle u, v \rangle} = \langle v, u \rangle$ for all $u, v \in V$.

The norm associated to the scalar product $\langle \cdot, \cdot \rangle$ is defined by $\|u\| := \sqrt{\langle u, u \rangle}$, for all $u \in V$. As one easily verifies using the defining properties of a scalar product, this norm satisfies the so-called parallelogram identity: for all $a, b \in V$, one has

$$ \|a + b\|^2 + \|a - b\|^2 = 2\|a\|^2 + 2\|b\|^2 . \qquad (44) $$

We say that a sequence $\{v_n \in V, \; n \in \mathbb{N}\}$ of vectors in $V$ converges to an element $v \in V$ if for all $\epsilon > 0$ there exists a $N(\epsilon) \in \mathbb{N}$ such that $\|v_n - v\| \leq \epsilon$ for all $n \geq N(\epsilon)$. In this case we write $v = \lim_{n \to \infty} v_n$. A sequence $\{v_n \in V, \; n \in \mathbb{N}\}$ of vectors in $V$ is said to be a Cauchy sequence if for all $\epsilon > 0$ there exists a $N(\epsilon) \in \mathbb{N}$ such that $\|v_n - v_m\| \leq \epsilon$ for all $n, m \in \mathbb{N}$ such that $n \geq N(\epsilon)$ and $m \geq N(\epsilon)$. A complex vector space $V$ is said to be a Hilbert space if it has a scalar product and if it is complete, i.e., if all Cauchy sequences in $V$ converge to an element of $V$.



The Best Approximant Theorem 

A subset $A$ of a Hilbert space $\mathcal{H}$ is said to be convex if for all $u, v \in A$ and all $\mu \in [0, 1]$ one has $\mu u + (1 - \mu)v \in A$. A subset $A$ of a Hilbert space $\mathcal{H}$ is said to be closed if every sequence $\{u_n \in A, \; n \in \mathbb{N}\}$ of elements of $A$ that converges in $\mathcal{H}$ converges to an element of $A$. The following theorem is of fundamental importance in the theory of Hilbert spaces.

Theorem A.1 (Best Approximant Theorem) Let $A$ be a convex and closed subset of a Hilbert space $\mathcal{H}$. Then, for all $x \in \mathcal{H}$ there exists a unique $y \in A$ such that $\|x - y\|$ equals the smallest possible distance between $x$ and $A$, that means, $\|x - y\| = \inf_{y' \in A} \|x - y'\|$. □

Proof. Let $D \geq 0$ be defined by $D = \inf_{y' \in A} \|x - y'\|$. For each $n \in \mathbb{N}$ let us choose a vector $y_n \in A$ with the property that $\|x - y_n\|^2 < D^2 + \frac{1}{n}$. Such a choice is always possible, by the definition of the infimum of a set of real numbers bounded from below.

Let us now prove that the sequence $y_n$, $n \in \mathbb{N}$, is a Cauchy sequence in $\mathcal{H}$. Let us take $a = x - y_n$ and $b = x - y_m$ in the parallelogram identity (44). Then, $\|2x - (y_m + y_n)\|^2 + \|y_m - y_n\|^2 = 2\|x - y_n\|^2 + 2\|x - y_m\|^2$. This can be written as $\|y_m - y_n\|^2 = 2\|x - y_n\|^2 + 2\|x - y_m\|^2 - 4\|x - (y_m + y_n)/2\|^2$. Now, using the fact that $\|x - y_n\|^2 < D^2 + \frac{1}{n}$ for each $n \in \mathbb{N}$, we get

$$ \|y_m - y_n\|^2 \leq 4D^2 + 2\left( \frac{1}{n} + \frac{1}{m} \right) - 4\|x - (y_m + y_n)/2\|^2 . $$

Since $(y_m + y_n)/2$ is a convex linear combination of elements of the convex set $A$, it belongs to $A$. Hence, by the definition of $D$, $\|x - (y_m + y_n)/2\|^2 \geq D^2$. Therefore, we have

$$ \|y_m - y_n\|^2 \leq 4D^2 + 2\left( \frac{1}{n} + \frac{1}{m} \right) - 4D^2 = 2\left( \frac{1}{n} + \frac{1}{m} \right) . $$

The right hand side can be made arbitrarily small, by taking both $m$ and $n$ large enough, proving that $\{y_n\}_{n \in \mathbb{N}}$ is a Cauchy sequence. Since $A$ is a closed subset of the complete space $\mathcal{H}$, the sequence $\{y_n\}_{n \in \mathbb{N}}$ converges to some $y \in A$.

Now we prove that $\|x - y\| = D$. In fact, for all $n \in \mathbb{N}$ one has

$$ \|x - y\| = \|(x - y_n) - (y - y_n)\| \leq \|x - y_n\| + \|y - y_n\| < \sqrt{D^2 + \frac{1}{n}} + \|y - y_n\| . $$

Taking $n \to \infty$ and using the fact that $y_n$ converges to $y$, we conclude that $\|x - y\| \leq D$. On the other hand, $\|x - y\| \geq D$ by the definition of $D$ and we must have $\|x - y\| = D$.

At last, it remains to prove the uniqueness of $y$. Assume that there is another $y' \in A$ such that $\|x - y'\| = D$. Using again the parallelogram identity (44), but now with $a = x - y$ and $b = x - y'$, we get

$$ \|2x - (y + y')\|^2 + \|y - y'\|^2 = 2\|x - y\|^2 + 2\|x - y'\|^2 = 4D^2 , $$

that means,

$$ \|y - y'\|^2 = 4D^2 - \|2x - (y + y')\|^2 = 4D^2 - 4\|x - (y + y')/2\|^2 . $$

Since $(y + y')/2 \in A$ (for $A$ being convex) it follows that $\|x - (y + y')/2\|^2 \geq D^2$ and, hence, $\|y - y'\|^2 \leq 0$, proving that $y = y'$. ■
Orthogonal complements

If $E$ is a subset of a Hilbert space $\mathcal{H}$, we define its orthogonal complement $E^\perp$ as the set of vectors in $\mathcal{H}$ orthogonal to all vectors in $E$: $E^\perp = \{ y \in \mathcal{H} \, | \, \langle y, x \rangle = 0 \text{ for all } x \in E \}$. The following proposition is of fundamental importance:

Proposition A.2 The orthogonal complement $E^\perp$ of a subset $E$ of a Hilbert space $\mathcal{H}$ is a closed linear subspace of $\mathcal{H}$. □

Proof. If $x, y \in E^\perp$, then, for any $\alpha, \beta \in \mathbb{C}$, one has $\langle \alpha x + \beta y, z \rangle = \overline{\alpha} \langle x, z \rangle + \overline{\beta} \langle y, z \rangle = 0$ for any $z \in E$, showing that $\alpha x + \beta y \in E^\perp$. Hence, $E^\perp$ is a linear subspace of $\mathcal{H}$. If $x_n$ is a sequence in $E^\perp$ converging to $x \in \mathcal{H}$, then, for all $z \in E$ one has $\langle x, z \rangle = \langle \lim_{n \to \infty} x_n, z \rangle = \lim_{n \to \infty} \langle x_n, z \rangle = 0$, since $\langle x_n, z \rangle = 0$ for all $n$. Hence, $x \in E^\perp$, showing that $E^\perp$ is closed. Above, in exchanging the limit with the scalar product, we used the continuity of the scalar product. ■



The Orthogonal Decomposition Theorem 

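Theorem A.3 (Orthogonal Decomposition Theorem) Let $\mathcal{M}$ be a closed and linear (and therefore convex) subspace of a Hilbert space $\mathcal{H}$. Then every $x \in \mathcal{H}$ can be written in a unique way in the form $x = y + z$, with $y \in \mathcal{M}$ and $z \in \mathcal{M}^\perp$. The vector $y$ is such that $\|x - y\| = \inf_{y' \in \mathcal{M}} \|x - y'\|$, i.e., is the best approximant of $x$ in $\mathcal{M}$. □

Proof. Let $x$ be an arbitrary element of $\mathcal{H}$. Since $\mathcal{M}$ is convex and closed, let us invoke Theorem A.1 and choose $y$ as the (unique) element of $\mathcal{M}$ such that $\|x - y\| = \inf_{y' \in \mathcal{M}} \|x - y'\|$. Defining $z := x - y$, all we have to do is to show that $z \in \mathcal{M}^\perp$ and to show uniqueness of $y$ and $z$. Let us first prove that $z \in \mathcal{M}^\perp$. By the definition of $y$ one has $\|x - y\|^2 \leq \|x - y - \lambda y'\|^2$ for all $\lambda \in \mathbb{C}$ and all $y' \in \mathcal{M}$. By the definition of $z$, it follows that $\|z\|^2 \leq \|z - \lambda y'\|^2$ for all $\lambda \in \mathbb{C}$. Writing the right hand side as $\langle z - \lambda y', z - \lambda y' \rangle$ we get $\|z\|^2 \leq \|z\|^2 - 2\mathrm{Re}(\lambda \langle z, y' \rangle) + |\lambda|^2 \|y'\|^2$. Hence,

$$ 2\mathrm{Re}(\lambda \langle z, y' \rangle) \leq |\lambda|^2 \|y'\|^2 . \qquad (45) $$

Now, write $\langle z, y' \rangle = |\langle z, y' \rangle| e^{i\alpha}$, for some $\alpha \in \mathbb{R}$. Since (45) holds for all $\lambda \in \mathbb{C}$, we can pick $\lambda$ in the form $\lambda = t e^{-i\alpha}$, $t > 0$, and (45) becomes $2t |\langle z, y' \rangle| \leq t^2 \|y'\|^2$. Hence, $|\langle z, y' \rangle| \leq \frac{t}{2} \|y'\|^2$ for all $t > 0$. But this is only possible if the left hand side vanishes: $|\langle z, y' \rangle| = 0$. Since $y'$ is an arbitrary element of $\mathcal{M}$, this shows that $z \in \mathcal{M}^\perp$.

To prove uniqueness, assume that $x = y' + z'$ with $y' \in \mathcal{M}$ and $z' \in \mathcal{M}^\perp$. We would have $y - y' = z' - z$. But $y - y' \in \mathcal{M}$ and $z' - z \in \mathcal{M}^\perp$. Hence, both belong to $\mathcal{M} \cap \mathcal{M}^\perp = \{0\}$, showing that $y - y' = z' - z = 0$. ■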
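In finite dimension the decomposition $x = y + z$ of Theorem A.3 can be computed explicitly. The following minimal sketch assumes NumPy and takes for $\mathcal{M}$ the subspace of $\mathbb{R}^3$ spanned by the columns of an illustrative matrix `M`; an orthonormal basis of $\mathcal{M}$ is obtained here from a QR factorization, which is one convenient choice among several.

```python
import numpy as np

# The columns of M span a 2-dimensional subspace of R^3 (illustrative choice).
M = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])
x = np.array([1.0, 2.0, 4.0])

Q, _ = np.linalg.qr(M)       # columns of Q: an orthonormal basis of the subspace
y = Q @ (Q.T @ x)            # component of x in the subspace (best approximant)
z = x - y                    # component of x in the orthogonal complement

print(y, z)
```

Here one finds $y = (0, 3, 3)$ and $z = (1, -1, 1)$, and $z$ is orthogonal to both columns of `M`.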



The spectrum of a matrix 

The spectrum of a matrix $A \in \mathrm{Mat}(\mathbb{C}, n)$, denoted by $\sigma(A)$, is the set of all $\lambda \in \mathbb{C}$ for which the matrix $\lambda \mathbb{1} - A$ has no inverse.

The characteristic polynomial of a matrix $A \in \mathrm{Mat}(\mathbb{C}, n)$ is defined by $p_A(z) := \det(z\mathbb{1} - A)$. It is clearly a polynomial of degree $n$ in $z$. It follows readily from these definitions that $\sigma(A)$ coincides with the set of roots of $p_A$. The elements of $\sigma(A)$ are said to be the eigenvalues of $A$. If $\lambda$ is an eigenvalue of $A$, the matrix $A - \lambda \mathbb{1}$ has no inverse and, therefore, there exists at least one non-vanishing vector $v \in \mathbb{C}^n$ such that $(A - \lambda \mathbb{1})v = 0$, that means, such that $Av = \lambda v$. Such a vector is said to be an eigenvector of $A$ with eigenvalue $\lambda$. The set of all eigenvectors associated to a given eigenvalue (plus the null vector) is a linear subspace of $\mathbb{C}^n$, as one easily sees.

The multiplicity of a root $\lambda$ of the characteristic polynomial of a matrix $A \in \mathrm{Mat}(\mathbb{C}, n)$ is called the algebraic multiplicity of the eigenvalue $\lambda$. The dimension of the subspace generated by the eigenvectors associated to the eigenvalue $\lambda$ is called the geometric multiplicity of the eigenvalue $\lambda$. The algebraic multiplicity of an eigenvalue is always larger than or equal to its geometric multiplicity.
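A standard example where the two multiplicities differ is a $2 \times 2$ Jordan block. The sketch below, assuming NumPy, counts the algebraic multiplicity from the computed eigenvalues and obtains the geometric multiplicity as $n - \mathrm{rank}(A - \lambda \mathbb{1})$, i.e., as the dimension of the kernel of $A - \lambda \mathbb{1}$; the matrix and the variable names are illustrative.

```python
import numpy as np

J = np.array([[2.0, 1.0],
              [0.0, 2.0]])          # Jordan block with the single eigenvalue 2
lam = 2.0
n = J.shape[0]

eigenvalues = np.linalg.eigvals(J)
algebraic = int(np.sum(np.isclose(eigenvalues, lam)))             # multiplicity as a root of p_J
geometric = int(n - np.linalg.matrix_rank(J - lam * np.eye(n)))   # dimension of Ker(J - lam*1)

print(algebraic, geometric)
```

For this matrix the algebraic multiplicity is 2 while the geometric multiplicity is 1.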

The neighborhood of singular matrices 

Proposition A.4 Let $A \in \mathrm{Mat}(\mathbb{C}, n)$ be arbitrary and let $B \in \mathrm{Mat}(\mathbb{C}, n)$ be a non-singular matrix. Then, there exist constants $M_1$ and $M_2$ (depending on $A$ and $B$) with $0 < M_1 \leq M_2$ such that $A + \mu B$ is invertible for all $\mu \in \mathbb{C}$ with $0 < |\mu| < M_1$ and for all $\mu \in \mathbb{C}$ with $|\mu| > M_2$. □

Proof. Since $B$ has an inverse, we may write $A + \mu B = (\mu \mathbb{1} + AB^{-1})B$. Hence, $A + \mu B$ has an inverse if and only if $\mu \mathbb{1} + AB^{-1}$ is non-singular.

Let $C \equiv -AB^{-1}$ and let $\{\lambda_1, \ldots, \lambda_n\} \subset \mathbb{C}$ be the $n$ not necessarily distinct roots of the characteristic polynomial $p_C$ of $C$. If all roots vanish, we take $M_1 = M_2 > 0$, arbitrary. Otherwise, let us define $M_1 := \min\{|\lambda_k|, \; \lambda_k \neq 0\}$ and $M_2 := \max\{|\lambda_k|, \; k = 1, \ldots, n\}$. Then, the sets $\{\mu \in \mathbb{C} \, | \, 0 < |\mu| < M_1\}$ and $\{\mu \in \mathbb{C} \, | \, |\mu| > M_2\}$ do not contain roots of $p_C$ and, therefore, for $\mu$ in these sets, the matrix $\mu \mathbb{1} - C = \mu \mathbb{1} + AB^{-1}$ is non-singular. ■



Similar matrices 

Two matrices $A \in \mathrm{Mat}(\mathbb{C}, n)$ and $B \in \mathrm{Mat}(\mathbb{C}, n)$ are said to be similar if there is a non-singular matrix $P \in \mathrm{Mat}(\mathbb{C}, n)$ such that $P^{-1}AP = B$. One has the following elementary fact:

Proposition A.5 Let $A$ and $B \in \mathrm{Mat}(\mathbb{C}, n)$ be two similar matrices. Then their characteristic polynomials coincide, $p_A = p_B$, and, therefore, their spectra also coincide, $\sigma(A) = \sigma(B)$, as well as the algebraic multiplicities of their eigenvalues. □

Proof. Let $P \in \mathrm{Mat}(\mathbb{C}, n)$ be such that $P^{-1}AP = B$. Then, $p_A(z) = \det(z\mathbb{1} - A) = \det\left( P^{-1}(z\mathbb{1} - A)P \right) = \det\left( z\mathbb{1} - P^{-1}AP \right) = \det(z\mathbb{1} - B) = p_B(z)$, for all $z \in \mathbb{C}$. ■



The spectrum of products of matrices 

The next proposition contains a non-evident consequence of Propositions A.5 and A.4.

Proposition A.6 Let $A, B \in \mathrm{Mat}(\mathbb{C}, n)$. Then, the characteristic polynomials of the matrices $AB$ and $BA$ coincide: $p_{AB} = p_{BA}$. Therefore, their spectra also coincide, $\sigma(AB) = \sigma(BA)$, as well as the algebraic multiplicities of their eigenvalues. □

Proof. If $A$ or $B$ (or both) are non-singular, then $AB$ and $BA$ are similar. In fact, in the first case we can write $AB = A(BA)A^{-1}$ and in the second one has $AB = B^{-1}(BA)B$. In both cases the claim follows from Proposition A.5. Let us now consider the case where neither $A$ nor $B$ is invertible. We know from Proposition A.4 that there exists $M > 0$ such that $A + \mu \mathbb{1}$ is non-singular for all $\mu \in \mathbb{C}$ with $0 < |\mu| < M$. Hence, for such values of $\mu$, we have by the argument above that $p_{(A + \mu \mathbb{1})B} = p_{B(A + \mu \mathbb{1})}$. Now the coefficients of the polynomials $p_{(A + \mu \mathbb{1})B}$ and $p_{B(A + \mu \mathbb{1})}$ are polynomials in $\mu$ and, therefore, are continuous. Hence, the equality $p_{(A + \mu \mathbb{1})B} = p_{B(A + \mu \mathbb{1})}$ remains valid by taking the limit $\mu \to 0$, leading to $p_{AB} = p_{BA}$. ■

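Proposition A.6 can be extended to products of non-square matrices:

Proposition A.7 Let $A \in \mathrm{Mat}(\mathbb{C}, m, n)$ and $B \in \mathrm{Mat}(\mathbb{C}, n, m)$. Clearly, $AB \in \mathrm{Mat}(\mathbb{C}, m)$ and $BA \in \mathrm{Mat}(\mathbb{C}, n)$. Then, one has $x^n p_{AB}(x) = x^m p_{BA}(x)$. Therefore, $\sigma(AB) \setminus \{0\} = \sigma(BA) \setminus \{0\}$, i.e., the set of non-zero eigenvalues of $AB$ coincides with the set of non-zero eigenvalues of $BA$. □

Proof. Consider the two $(m + n) \times (m + n)$ matrices defined by

$$ A' := \begin{pmatrix} A & 0_{m, m} \\ 0_{n, n} & 0_{n, m} \end{pmatrix} \quad \text{and} \quad B' := \begin{pmatrix} B & 0_{n, n} \\ 0_{m, m} & 0_{m, n} \end{pmatrix} , $$

in analogy with the square matrix associated earlier to a rectangular matrix. It is easy to see that

$$ A'B' = \begin{pmatrix} AB & 0_{m, n} \\ 0_{n, m} & 0_{n, n} \end{pmatrix} \quad \text{and that} \quad B'A' = \begin{pmatrix} BA & 0_{n, m} \\ 0_{m, n} & 0_{m, m} \end{pmatrix} . $$

From this, it is now easy to see that $p_{A'B'}(x) = x^n p_{AB}(x)$ and that $p_{B'A'}(x) = x^m p_{BA}(x)$. By Proposition A.6 one has $p_{A'B'}(x) = p_{B'A'}(x)$, completing the proof. ■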
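The relation $x^n p_{AB}(x) = x^m p_{BA}(x)$ is easy to test numerically. The sketch below assumes NumPy and uses `numpy.poly`, which returns the coefficients (highest degree first) of the characteristic polynomial of a square matrix; multiplying a polynomial by $x^k$ then amounts to appending $k$ zero coefficients. The matrices are illustrative choices with $m = 2$ and $n = 3$.

```python
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0]])          # m x n with m = 2, n = 3
B = np.array([[1.0, 0.0],
              [2.0, 1.0],
              [0.0, 1.0]])               # n x m
m, n = A.shape

p_AB = np.poly(A @ B)                    # coefficients of det(x*1 - AB), degree m
p_BA = np.poly(B @ A)                    # coefficients of det(x*1 - BA), degree n

lhs = np.concatenate([p_AB, np.zeros(n)])   # coefficients of x^n p_AB(x)
rhs = np.concatenate([p_BA, np.zeros(m)])   # coefficients of x^m p_BA(x)

print(lhs)
print(rhs)
```

In this example $p_{AB}(x) = x^2 - 9x + 16$ and $p_{BA}(x) = x^3 - 9x^2 + 16x$, so that both sides equal $x^5 - 9x^4 + 16x^3$.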
Diagonalizable matrices

A matrix $A \in \mathrm{Mat}(\mathbb{C}, n)$ is said to be diagonalizable if it is similar to a diagonal matrix. Hence $A \in \mathrm{Mat}(\mathbb{C}, n)$ is diagonalizable if there exists a non-singular matrix $P \in \mathrm{Mat}(\mathbb{C}, n)$ such that $P^{-1}AP$ is diagonal. The next theorem gives a necessary and sufficient condition for a matrix to be diagonalizable:

Theorem A.8 A matrix $A \in \mathrm{Mat}(\mathbb{C}, n)$ is diagonalizable if and only if it has $n$ linearly independent eigenvectors, i.e., if the subspace generated by its eigenvectors is $n$-dimensional. □

Proof. Let us assume that $A$ has $n$ linearly independent eigenvectors $\{v^1, \ldots, v^n\}$, whose eigenvalues are $\{d_1, \ldots, d_n\}$, respectively. Let $P \in \mathrm{Mat}(\mathbb{C}, n)$ be the matrix whose columns are these eigenvectors, $P = [v^1, \ldots, v^n]$, and let $D := \mathrm{diag}(d_1, \ldots, d_n)$. One has

$$ AP = [Av^1, \ldots, Av^n] = [d_1 v^1, \ldots, d_n v^n] $$

and by (13) one has $[d_1 v^1, \ldots, d_n v^n] = PD$. Therefore $AP = PD$. Since the columns of $P$ are linearly independent, $P$ is non-singular and one has $P^{-1}AP = D$, showing that $A$ is diagonalizable.

Let us now assume that $A$ is diagonalizable and that there is a non-singular $P \in \mathrm{Mat}(\mathbb{C}, n)$ such that $P^{-1}AP = D = \mathrm{diag}(d_1, \ldots, d_n)$. It is evident that the vectors of the canonical basis (10) are eigenvectors of $D$, with $De_a = d_a e_a$. Therefore, $v_a = Pe_a$ are eigenvectors of $A$, since $Av_a = APe_a = PDe_a = P(d_a e_a) = d_a Pe_a = d_a v_a$. To show that these vectors $v_a$ are linearly independent, assume that there are complex numbers $\alpha_1, \ldots, \alpha_n$ such that $\alpha_1 v_1 + \cdots + \alpha_n v_n = 0$. Multiplying by $P^{-1}$ from the left, we get $\alpha_1 e_1 + \cdots + \alpha_n e_n = 0$, implying $\alpha_1 = \cdots = \alpha_n = 0$, since the elements $e_a$ of the canonical basis are linearly independent. ■
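The construction used in the first part of the proof can be reproduced numerically. The sketch below assumes NumPy; `numpy.linalg.eig` returns the eigenvalues together with a matrix whose columns are corresponding eigenvectors, which plays the role of $P$. The matrix `A` is an illustrative choice with the two distinct eigenvalues 2 and 5.

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

d, P = np.linalg.eig(A)              # eigenvalues d_k and eigenvectors (columns of P)
D = np.linalg.inv(P) @ A @ P         # P^{-1} A P, expected to be diag(d_1, ..., d_n)

print(np.round(D, 12))
```

Since the two eigenvectors are linearly independent, $P$ is invertible and $P^{-1}AP$ is diagonal up to rounding errors.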



The Spectral Theorem is one of the fundamental results of Functional Analysis and its version for bounded and unbounded self-adjoint operators in Hilbert spaces is of fundamental importance for the so-called probabilistic interpretation of Quantum Mechanics. Here we prove its simplest version for square matrices.

Theorem A.9 (Spectral Theorem for Matrices) A matrix $A \in \mathrm{Mat}(\mathbb{C}, n)$ is diagonalizable if and only if there exist $r \in \mathbb{N}$, $1 \leq r \leq n$, scalars $\alpha_1, \ldots, \alpha_r \in \mathbb{C}$ and non-zero distinct projectors $E_1, \ldots, E_r \in \mathrm{Mat}(\mathbb{C}, n)$ such that

$$ A = \sum_{a=1}^{r} \alpha_a E_a , \qquad (46) $$

and

$$ \mathbb{1} = \sum_{a=1}^{r} E_a , \qquad (47) $$

with $E_i E_j = \delta_{i, j} E_j$. The numbers $\alpha_1, \ldots, \alpha_r$ are the distinct eigenvalues of $A$. □

The projectors $E_a$ in (46) are called the spectral projectors of $A$. The decomposition (46) is called the spectral decomposition of $A$. In Proposition A.11 we will show how the spectral projections $E_a$ can be expressed in terms of polynomials in $A$. In Proposition A.12 we establish the uniqueness of the spectral decomposition of a diagonalizable matrix.

Proof of Theorem A.9. If $A \in \mathrm{Mat}(\mathbb{C}, n)$ is diagonalizable, there exists $P \in \mathrm{Mat}(\mathbb{C}, n)$ such that $P^{-1}AP = D = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$, where $\lambda_1, \ldots, \lambda_n$ are the eigenvalues of $A$. Let us denote by $\{\alpha_1, \ldots, \alpha_r\}$, $1 \leq r \leq n$, the set of all distinct eigenvalues of $A$.

One can clearly write $D = \sum_{a=1}^{r} \alpha_a K_a$, where $K_a \in \mathrm{Mat}(\mathbb{C}, n)$ are diagonal matrices having 0 or 1 as diagonal elements, so that

$$ (K_a)_{ij} = \begin{cases} 1 , & \text{if } i = j \text{ and } (D)_{ii} = \alpha_a , \\ 0 , & \text{if } i = j \text{ and } (D)_{ii} \neq \alpha_a , \\ 0 , & \text{if } i \neq j . \end{cases} $$

Hence, $(K_a)_{ij} = 1$ if $i = j$ and $(D)_{ii} = \alpha_a$ and $(K_a)_{ij} = 0$ otherwise. It is trivial to see that

$$ \sum_{a=1}^{r} K_a = \mathbb{1} \qquad (48) $$

and that

$$ K_a K_b = \delta_{a, b} K_a . \qquad (49) $$

Since $A = PDP^{-1}$, one has $A = \sum_{a=1}^{r} \alpha_a E_a$, where $E_a := PK_aP^{-1}$. It is easy to prove from (48) that $\mathbb{1} = \sum_{a=1}^{r} E_a$ and it is easy to prove from (49) that $E_i E_j = \delta_{i, j} E_j$.

Reciprocally, let us now assume that $A$ has a representation like (46), with the $E_a$'s having the above mentioned properties. Let us first notice that for any vector $x$ and for $k \in \{1, \ldots, r\}$, one has by (46)

$$ AE_k x = \sum_{j=1}^{r} \alpha_j E_j E_k x = \alpha_k E_k x . $$

Hence, $E_k x$ is either zero or is an eigenvector of $A$. Therefore, the subspace $\mathcal{S}$ generated by all vectors $\{E_k x, \; x \in \mathbb{C}^n, \; k = 1, \ldots, r\}$ is a subspace of the space $\mathcal{A}$ generated by all eigenvectors of $A$. However, from (47), one has, for all $x \in \mathbb{C}^n$, $x = \mathbb{1}x = \sum_{k=1}^{r} E_k x$ and this reveals that $\mathbb{C}^n = \mathcal{S} \subset \mathcal{A}$. Hence, $\mathcal{A} = \mathbb{C}^n$ and by Theorem A.8 $A$ is diagonalizable. ■

The Spectral Theorem has the following corollary, known as the functional calculus:

Theorem A.10 (Functional Calculus) Let $A \in \mathrm{Mat}(\mathbb{C}, n)$ be diagonalizable and let $A = \sum_{a=1}^{r} \alpha_a E_a$ be its spectral decomposition. Then, for any polynomial $p$ one has $p(A) = \sum_{a=1}^{r} p(\alpha_a) E_a$. □

Proof. By the properties of the spectral projectors $E_a$, one sees easily that $A^2 = \sum_{a, b=1}^{r} \alpha_a \alpha_b E_a E_b = \sum_{a, b=1}^{r} \alpha_a \alpha_b \delta_{a, b} E_a = \sum_{a=1}^{r} (\alpha_a)^2 E_a$. It is then easy to prove by induction that $A^m = \sum_{a=1}^{r} (\alpha_a)^m E_a$, for all $m \in \mathbb{N}_0$ (by adopting the convention that $A^0 = \mathbb{1}$, the case $m = 0$ is simply (47)). From this, the rest of the proof is elementary. ■

One can also easily show that for a non-singular diagonalizable matrix $A \in \mathrm{Mat}(\mathbb{C}, n)$ one has

$$ A^{-1} = \sum_{a=1}^{r} \frac{1}{\alpha_a} E_a . \qquad (50) $$

Getting the spectral projections 

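One of the most useful consequences of the functional calculus is an explicit formula for the spectral projections of a diagonalizable matrix $A$ in terms of a polynomial in $A$.

Proposition A.11 Let $A \in \mathrm{Mat}(\mathbb{C}, n)$ be non-zero and diagonalizable and let $A = \alpha_1 E_1 + \cdots + \alpha_r E_r$ be its spectral decomposition. Let the polynomials $p_j$, $j = 1, \ldots, r$, be defined by

$$ p_j(x) := \prod_{\substack{l=1 \\ l \neq j}}^{r} \left( \frac{x - \alpha_l}{\alpha_j - \alpha_l} \right) . \qquad (51) $$

Then,

$$ E_j = p_j(A) = \left( \prod_{\substack{k=1 \\ k \neq j}}^{r} \frac{1}{\alpha_j - \alpha_k} \right) \prod_{\substack{l=1 \\ l \neq j}}^{r} \left( A - \alpha_l \mathbb{1} \right) \qquad (52) $$

for all $j = 1, \ldots, r$. □

Proof. By the definition of the polynomials $p_j$, it is evident that $p_j(\alpha_k) = \delta_{j, k}$. Hence, by Theorem A.10, $p_j(A) = \sum_{k=1}^{r} p_j(\alpha_k) E_k = E_j$. ■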
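Formula (52) can be tried out on a small non-symmetric example, which also illustrates the functional calculus and relation (50). The sketch below assumes NumPy; the function name `spectral_projectors` is ours, and the list of distinct eigenvalues is supplied by hand rather than computed.

```python
import numpy as np

def spectral_projectors(A, alphas):
    """Spectral projectors E_j of a diagonalizable A via formula (52).

    `alphas` must list the distinct eigenvalues of A.
    """
    n = A.shape[0]
    I = np.eye(n)
    projectors = []
    for j, alpha_j in enumerate(alphas):
        E_j = np.eye(n)
        for l, alpha_l in enumerate(alphas):
            if l != j:
                E_j = E_j @ (A - alpha_l * I) / (alpha_j - alpha_l)
        projectors.append(E_j)
    return projectors

A = np.array([[1.0, 1.0],
              [0.0, 2.0]])                # diagonalizable, eigenvalues 1 and 2
alphas = [1.0, 2.0]
E1, E2 = spectral_projectors(A, alphas)

A_cubed = 1.0**3 * E1 + 2.0**3 * E2       # functional calculus with p(x) = x^3
A_inverse = E1 / 1.0 + E2 / 2.0           # relation (50)
```

Here $E_1 = \begin{pmatrix} 1 & -1 \\ 0 & 0 \end{pmatrix}$ and $E_2 = \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix}$. These are projectors but not self-adjoint ones, as expected for a matrix that is diagonalizable without being self-adjoint.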
Uniqueness of the spectral decomposition

Proposition A.12 The spectral decomposition of a diagonalizable matrix $A \in \mathrm{Mat}(\mathbb{C}, n)$ is unique. □

Proof. Let $A = \sum_{k=1}^{r} \alpha_k E_k$ be the spectral decomposition of $A$ as described in Theorem A.9, where $\alpha_k$, $k = 1, \ldots, r$, with $1 \leq r \leq n$ are the distinct eigenvalues of $A$. Let $A = \sum_{k=1}^{s} \beta_k F_k$ be a second representation of $A$, where the $\beta_k$'s are distinct and where the $F_k$'s are non-vanishing and satisfy $F_j F_l = \delta_{j, l} F_l$ and $\mathbb{1} = \sum_{k=1}^{s} F_k$. For a vector $x \neq 0$ it holds $x = \sum_{k=1}^{s} F_k x$, so that not all vectors $F_k x$ vanish. Let $F_{k_0} x \neq 0$. One has $AF_{k_0} x = \sum_{k=1}^{s} \beta_k F_k F_{k_0} x = \beta_{k_0} F_{k_0} x$. This shows that $\beta_{k_0}$ is one of the eigenvalues of $A$. Since every $F_k$ is non-vanishing, the same argument applies to each index $k$ and, hence, $\{\beta_1, \ldots, \beta_s\} \subset \{\alpha_1, \ldots, \alpha_r\}$ and we must have $s \leq r$. Let us order both sets such that $\beta_k = \alpha_k$ for all $1 \leq k \leq s$. Hence,

$$ A = \sum_{k=1}^{r} \alpha_k E_k = \sum_{k=1}^{s} \alpha_k F_k . \qquad (53) $$

Now, consider the polynomials $p_j$, $j = 1, \ldots, r$, defined in (51), for which $p_j(\alpha_j) = 1$ and $p_j(\alpha_k) = 0$ for all $k \neq j$. By the functional calculus, it follows from (53) that, for $1 \leq j \leq s$,

$$ p_j(A) = \sum_{k=1}^{r} p_j(\alpha_k) E_k = \sum_{k=1}^{s} p_j(\alpha_k) F_k , \quad \therefore \quad E_j = F_j . $$

(The equality $p_j(A) = \sum_{k=1}^{s} p_j(\alpha_k) F_k$ follows from the fact that the $E_k$'s and the $F_k$'s satisfy the same algebraic relations and, hence, the functional calculus also holds for the representation of $A$ in terms of the $F_k$'s.) Since $\mathbb{1} = \sum_{k=1}^{r} E_k = \sum_{k=1}^{s} F_k$, and $E_j = F_j$ for all $1 \leq j \leq s$, one has $\sum_{k=s+1}^{r} E_k = 0$. Hence, multiplying by $E_l$, with $s + 1 \leq l \leq r$, it follows that $E_l = 0$ for all $s + 1 \leq l \leq r$. This is only possible if $r = s$, since the $E_k$'s are non-vanishing. This completes the proof. ■



Self-adjointness and diagonalizability 

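Let $A \in \mathrm{Mat}(\mathbb{C}, m, n)$. The adjoint matrix $A^* \in \mathrm{Mat}(\mathbb{C}, n, m)$ is defined as the unique matrix for which the equality

$$ \langle u, Av \rangle = \langle A^*u, v \rangle $$

holds for all $u \in \mathbb{C}^m$ and all $v \in \mathbb{C}^n$. If $A_{ij}$ are the matrix elements of $A$ in the canonical basis, it is an easy exercise to show that $(A^*)_{ij} = \overline{A_{ji}}$, where the bar denotes complex conjugation. It is trivial to prove that the following properties hold: 1. $(\alpha_1 A_1 + \alpha_2 A_2)^* = \overline{\alpha_1} A_1^* + \overline{\alpha_2} A_2^*$ for all $A_1, A_2 \in \mathrm{Mat}(\mathbb{C}, m, n)$ and all $\alpha_1, \alpha_2 \in \mathbb{C}$; 2. $(AB)^* = B^*A^*$ for all $A \in \mathrm{Mat}(\mathbb{C}, m, n)$ and $B \in \mathrm{Mat}(\mathbb{C}, n, p)$; 3. $A^{**} \equiv (A^*)^* = A$ for all $A \in \mathrm{Mat}(\mathbb{C}, m, n)$.

A square matrix $A \in \mathrm{Mat}(\mathbb{C}, n)$ is said to be self-adjoint if $A = A^*$. A square matrix $U \in \mathrm{Mat}(\mathbb{C}, n)$ is said to be unitary if $U^{-1} = U^*$. Self-adjoint matrices have real eigenvalues. In fact, if $A$ is self-adjoint, $\lambda \in \sigma(A)$ and $v \in \mathbb{C}^n$ is a normalized (i.e., $\|v\| = 1$) eigenvector of $A$ with eigenvalue $\lambda$, then $\lambda = \lambda \langle v, v \rangle = \langle v, \lambda v \rangle = \langle v, Av \rangle = \langle Av, v \rangle = \langle \lambda v, v \rangle = \overline{\lambda} \langle v, v \rangle = \overline{\lambda}$, showing that $\lambda \in \mathbb{R}$.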

Projectors and orthogonal projectors 

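A matrix $E \in \mathrm{Mat}(\mathbb{C}, n)$ is said to be a projector if $E^2 = E$ and it is said to be an orthogonal projector if it is a self-adjoint projector: $E^2 = E$ and $E^* = E$. An important example of an orthogonal projector is the following. Let $v \in \mathbb{C}^n$ be such that $\|v\| = 1$ and define

$$ P_v u := \langle v, u \rangle \, v , \qquad (54) $$

for each $u \in \mathbb{C}^n$. In the canonical basis, the matrix elements of $P_v$ are given by $(P_v)_{ij} = v_i \overline{v_j}$, where the $v_k$'s are the components of $v$. One has

$$ P_v^2 u = \langle v, u \rangle \, P_v v = \langle v, u \rangle \langle v, v \rangle \, v = \langle v, u \rangle \, v = P_v u , $$

proving that $P_v^2 = P_v$. On the other hand, for any $a, b \in \mathbb{C}^n$ we get

$$ \langle a, P_v b \rangle = \langle a, \langle v, b \rangle v \rangle = \langle v, b \rangle \langle a, v \rangle = \left\langle \overline{\langle a, v \rangle} \, v, b \right\rangle = \langle \langle v, a \rangle v, b \rangle = \langle P_v a, b \rangle , $$

showing that $P_v^* = P_v$. Another relevant fact is that if $v_1$ and $v_2$ are orthogonal unit vectors, i.e., $\langle v_i, v_j \rangle = \delta_{ij}$, then $P_{v_1} P_{v_2} = P_{v_2} P_{v_1} = 0$. In fact, for any $a \in \mathbb{C}^n$ one has

$$ P_{v_1}(P_{v_2} a) = P_{v_1}(\langle v_2, a \rangle v_2) = \langle v_2, a \rangle \, P_{v_1} v_2 = \langle v_2, a \rangle \langle v_1, v_2 \rangle \, v_1 = 0 . $$

This shows that $P_{v_1} P_{v_2} = 0$ and, since both are self-adjoint, one has also $P_{v_2} P_{v_1} = 0$.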
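These properties of $P_v$ are easily verified on a concrete example in $\mathbb{C}^2$. The sketch below assumes NumPy; `numpy.vdot` conjugates its first argument, which matches the convention for the scalar product used here, and `numpy.outer(v, v.conj())` produces the matrix with elements $v_i \overline{v_j}$. The vectors and the function name `projector` are illustrative.

```python
import numpy as np

def projector(v):
    """Matrix of the orthogonal projector P_v onto the direction of v."""
    v = np.asarray(v, dtype=complex)
    v = v / np.linalg.norm(v)            # normalize, so that ||v|| = 1
    return np.outer(v, v.conj())         # (P_v)_{ij} = v_i * conj(v_j)

v1 = np.array([1.0, 1.0j])
v2 = np.array([1.0, -1.0j])              # orthogonal to v1 in C^2
P1, P2 = projector(v1), projector(v2)

u = np.array([2.0 - 1.0j, 0.5j])
v = v1 / np.linalg.norm(v1)
Pu_from_definition = np.vdot(v, u) * v   # <v, u> v, as in (54)
```

One finds that `P1 @ u` coincides with `Pu_from_definition`, that `P1` and `P2` are self-adjoint and idempotent, and that `P1 @ P2` vanishes.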

Spectral Theorem for self-adjoint matrices 

The following theorem establishes a fundamental fact about self-adjoint matrices.

Theorem A.13 (Spectral Theorem for Self-adjoint Matrices) If $A \in \mathrm{Mat}(\mathbb{C}, n)$ is self-adjoint, one can find an orthonormal set $\{v_1, \ldots, v_n\}$ of eigenvectors of $A$ with real eigenvalues $\lambda_1, \ldots, \lambda_n$, respectively, and one has the spectral representation

$$ A = \lambda_1 P_{v_1} + \cdots + \lambda_n P_{v_n} , \qquad (55) $$

where $P_{v_k} u := \langle v_k, u \rangle v_k$ satisfy $P_{v_k}^* = P_{v_k}$ and $P_{v_j} P_{v_k} = \delta_{j k} P_{v_k}$ and one has $\sum_{k=1}^{n} P_{v_k} = \mathbb{1}$.

Therefore, if $A \in \mathrm{Mat}(\mathbb{C}, n)$ is a self-adjoint matrix it is diagonalizable. Moreover, there is a unitary $P \in \mathrm{Mat}(\mathbb{C}, n)$ such that $P^{-1}AP = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$. □

Proof. Let $\lambda_1 \in \mathbb{R}$ be an eigenvalue of $A$ and let $v_1$ be a corresponding eigenvector. Let us choose $\|v_1\| = 1$. Define $A_1 \in \mathrm{Mat}(\mathbb{C}, n)$ by $A_1 := A - \lambda_1 P_{v_1}$. Since both $A$ and $P_{v_1}$ are self-adjoint, so is $A_1$, since $\lambda_1$ is real.

It is easy to check that $A_1 v_1 = 0$. Moreover, $[v_1]^\perp$, the subspace orthogonal to $v_1$, is invariant under the action of $A_1$. In fact, for $w \in [v_1]^\perp$ one has $\langle A_1 w, v_1 \rangle = \langle w, A_1 v_1 \rangle = 0$, showing that $A_1 w \in [v_1]^\perp$.

It is therefore obvious that the restriction of $A_1$ to $[v_1]^\perp$ is also a self-adjoint operator. Let $v_2 \in [v_1]^\perp$ be an eigenvector of this self-adjoint restriction with eigenvalue $\lambda_2$ and choose $\|v_2\| = 1$. Define

$$ A_2 := A_1 - \lambda_2 P_{v_2} = A - \lambda_1 P_{v_1} - \lambda_2 P_{v_2} . $$

Since $\lambda_2$ is real, $A_2$ is self-adjoint. Moreover, $A_2$ annihilates the vectors in the subspace $[v_1, v_2]$ and keeps $[v_1, v_2]^\perp$ invariant. In fact, $A_2 v_1 = Av_1 - \lambda_1 P_{v_1} v_1 - \lambda_2 P_{v_2} v_1 = \lambda_1 v_1 - \lambda_1 v_1 - \lambda_2 \langle v_2, v_1 \rangle v_2 = 0$, since $\langle v_2, v_1 \rangle = 0$. Analogously, $A_2 v_2 = A_1 v_2 - \lambda_2 P_{v_2} v_2 = \lambda_2 v_2 - \lambda_2 v_2 = 0$. Finally, for any $\alpha, \beta \in \mathbb{C}$ and $w \in [v_1, v_2]^\perp$ one has $\langle A_2 w, (\alpha v_1 + \beta v_2) \rangle = \langle w, A_2(\alpha v_1 + \beta v_2) \rangle = 0$, showing that $[v_1, v_2]^\perp$ is invariant under the action of $A_2$.

Proceeding inductively, we find a set of vectors $\{v_1, \ldots, v_n\}$, with $\|v_k\| = 1$ and with $v_a \in [v_1, \ldots, v_{a-1}]^\perp$ for $2 \leq a \leq n$, and a set of real numbers $\{\lambda_1, \ldots, \lambda_n\}$ such that $A_n = A - \lambda_1 P_{v_1} - \cdots - \lambda_n P_{v_n}$ annihilates the subspace $[v_1, \ldots, v_n]$. But, since $\{v_1, \ldots, v_n\}$ is an orthonormal set, one must have $[v_1, \ldots, v_n] = \mathbb{C}^n$ and, therefore, we must have $A_n = 0$, meaning that

$$ A = \lambda_1 P_{v_1} + \cdots + \lambda_n P_{v_n} . \qquad (56) $$

One has $P_{v_k} P_{v_l} = \delta_{k, l} P_{v_k}$, since $\langle v_k, v_l \rangle = \delta_{k l}$. Moreover, since $\{v_1, \ldots, v_n\}$ is a basis in $\mathbb{C}^n$ one has

$$ x = \alpha_1 v_1 + \cdots + \alpha_n v_n \qquad (57) $$

for all $x \in \mathbb{C}^n$. By taking the scalar product with $v_k$ one gets that $\alpha_k = \langle v_k, x \rangle$ and, hence,

$$ x = \langle v_1, x \rangle v_1 + \cdots + \langle v_n, x \rangle v_n = P_{v_1} x + \cdots + P_{v_n} x = \left( P_{v_1} + \cdots + P_{v_n} \right) x . $$

Since $x$ was an arbitrary element of $\mathbb{C}^n$, we established that $P_{v_1} + \cdots + P_{v_n} = \mathbb{1}$.

It follows from (56) that $Av_a = \lambda_a v_a$. Hence, each $v_k$ is an eigenvector of $A$ with eigenvalue $\lambda_k$. By Theorem A.8, $A$ is diagonalizable: there is $P \in \mathrm{Mat}(\mathbb{C}, n)$ such that $P^{-1}AP = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$. As we saw in the proof of Theorem A.8 we can choose $P = [v_1, \ldots, v_n]$. This is, however, a unitary matrix, since, as one easily checks,

$$ P^*P = \begin{pmatrix} \langle v_1, v_1 \rangle & \cdots & \langle v_1, v_n \rangle \\ \vdots & \ddots & \vdots \\ \langle v_n, v_1 \rangle & \cdots & \langle v_n, v_n \rangle \end{pmatrix} = \mathbb{1} , $$

because $\langle v_a, v_b \rangle = \delta_{a, b}$. ■
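The statements of Theorem A.13 can be observed numerically with `numpy.linalg.eigh`, which is designed for self-adjoint matrices and returns real eigenvalues together with an orthonormal set of eigenvectors (as columns). The sketch below assumes NumPy; the matrix is an illustrative self-adjoint example.

```python
import numpy as np

A = np.array([[2.0, 1.0j, 0.0],
              [-1.0j, 2.0, 1.0],
              [0.0, 1.0, 1.0]])          # self-adjoint: A equals its conjugate transpose

lam, P = np.linalg.eigh(A)               # real eigenvalues, eigenvectors as columns of P

# Spectral representation (55): A = sum_k lam_k P_{v_k}, with P_{v_k} = v_k v_k^*.
A_rebuilt = sum(l * np.outer(v, v.conj()) for l, v in zip(lam, P.T))

# Resolution of the identity: sum_k P_{v_k} = 1.
identity = sum(np.outer(v, v.conj()) for v in P.T)
```

Up to rounding errors, `A_rebuilt` reproduces `A`, `identity` is the unit matrix, and `P` is unitary with `P.conj().T @ A @ P` diagonal.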
The Polar Decomposition Theorem for square matrices

It is well-known that every complex number $z \neq 0$ can be written in the so-called polar form $z = |z| e^{i\theta}$, where $|z| \geq 0$ and $\theta \in [-\pi, \pi)$, with $|z| := \sqrt{\overline{z} z}$ and $e^{i\theta} := z |z|^{-1}$. There is an analogous claim for square matrices $A \in \mathrm{Mat}(\mathbb{C}, n)$. This is the content of the so-called Polar Decomposition Theorem, Theorem A.14 below. Let us make some preliminary remarks.

Let $A \in \mathrm{Mat}(\mathbb{C}, n)$ and consider $A^*A$. One has $(A^*A)^* = A^*A^{**} = A^*A$ and, hence, $A^*A$ is self-adjoint. By Theorem A.13 we can find an orthonormal set $\{v_k, \; k = 1, \ldots, n\}$ of eigenvectors of $A^*A$, with eigenvalues $d_k$, $k = 1, \ldots, n$, respectively, with the matrix

$$ P := [v_1, \ldots, v_n] \qquad (58) $$

being unitary and such that $P^*(A^*A)P = D := \mathrm{diag}(d_1, \ldots, d_n)$. One has $d_k \geq 0$ since $d_k \|v_k\|^2 = d_k \langle v_k, v_k \rangle = \langle v_k, d_k v_k \rangle = \langle v_k, A^*Av_k \rangle = \langle Av_k, Av_k \rangle = \|Av_k\|^2$ and, hence, $d_k = \|Av_k\|^2 / \|v_k\|^2 \geq 0$.

Define $D^{1/2} := \mathrm{diag}\left( \sqrt{d_1}, \ldots, \sqrt{d_n} \right)$. One has $\left( D^{1/2} \right)^2 = D$. Moreover, $\left( D^{1/2} \right)^* = D^{1/2}$, since every $\sqrt{d_k}$ is real. The non-negative numbers $\sqrt{d_1}, \ldots, \sqrt{d_n}$ are called the singular values of $A$.

Define the matrix $\sqrt{A^*A} \in \mathrm{Mat}(\mathbb{C}, n)$ by

$$ \sqrt{A^*A} := PD^{1/2}P^* . \qquad (59) $$

The matrix $\sqrt{A^*A}$ is self-adjoint, since $\left( \sqrt{A^*A} \right)^* = \left( PD^{1/2}P^* \right)^* = PD^{1/2}P^* = \sqrt{A^*A}$. Notice that $\left( \sqrt{A^*A} \right)^2 = P\left( D^{1/2} \right)^2 P^* = PDP^* = A^*A$. From this, it follows that

$$ \left( \det \sqrt{A^*A} \right)^2 = \det\left( \left( \sqrt{A^*A} \right)^2 \right) = \det(A^*A) = \det(A^*) \det(A) = \overline{\det(A)} \det(A) = |\det(A)|^2 . $$

Hence, $\det\left( \sqrt{A^*A} \right) = |\det(A)|$ and, therefore, $\sqrt{A^*A}$ is invertible if and only if $A$ is invertible.

We will denote $\sqrt{A^*A}$ by $|A|$, following the analogy suggested by the complex numbers. Now we can formulate the Polar Decomposition Theorem for matrices:

Theorem A.14 (Polar Decomposition Theorem) If $A \in \mathrm{Mat}(\mathbb{C}, n)$ there is a unitary matrix $U \in \mathrm{Mat}(\mathbb{C}, n)$ such that

$$ A = U \sqrt{A^*A} . \qquad (60) $$

If $A$ is non-singular, then $U$ is unique. The representation (60) is called the polar representation of $A$. □

Proof. As above, let $d_k$, $k = 1, \ldots, n$, be the eigenvalues of $A^*A$ and let $v_k$, $k = 1, \ldots, n$, be a corresponding orthonormal set of eigenvectors: $A^*Av_k = d_k v_k$ and $\langle v_k, v_l \rangle = \delta_{k l}$ (see Theorem A.13).

Since $d_k \geq 0$ we order them in a way that $d_k > 0$ for all $k = 1, \ldots, r$ and $d_k = 0$ for all $k = r + 1, \ldots, n$. Hence,

$$ Av_k = 0 \quad \text{for all } k = r + 1, \ldots, n , \qquad (61) $$

because $A^*Av_k = 0$ implies $0 = \langle v_k, A^*Av_k \rangle = \langle Av_k, Av_k \rangle = \|Av_k\|^2$.

For $k = 1, \ldots, r$, let $w_k$ be the vectors defined by

$$ w_k := \frac{1}{\sqrt{d_k}} Av_k , \quad k = 1, \ldots, r . \qquad (62) $$

It is easy to see that

$$ \langle w_k, w_l \rangle = \frac{1}{\sqrt{d_k d_l}} \langle Av_k, Av_l \rangle = \frac{1}{\sqrt{d_k d_l}} \langle A^*Av_k, v_l \rangle = \frac{d_k}{\sqrt{d_k d_l}} \langle v_k, v_l \rangle = \frac{d_k}{\sqrt{d_k d_l}} \delta_{k l} = \delta_{k l} , $$

for all $k, l = 1, \ldots, r$. Hence, $\{w_k, \; k = 1, \ldots, r\}$ is an orthonormal set. We can add to this set an additional orthonormal set $\{w_k, \; k = r + 1, \ldots, n\}$, in the orthogonal complement of the set generated by $\{w_k, \; k = 1, \ldots, r\}$, and get a new orthonormal set $\{w_k, \; k = 1, \ldots, n\}$ as a basis for $\mathbb{C}^n$.

Let $P \in \mathrm{Mat}(\mathbb{C}, n)$ be defined as in (58) and let $Q$ and $U$ be elements of $\mathrm{Mat}(\mathbb{C}, n)$ defined by

$$ Q := [w_1, \ldots, w_n] , \qquad U := QP^* . $$

Since $\{v_k, \; k = 1, \ldots, n\}$ and $\{w_k, \; k = 1, \ldots, n\}$ are orthonormal sets, one easily sees that $P$ and $Q$ are unitary and, therefore, $U$ is also unitary.

It is easy to show that $AP = QD^{1/2}$, where $D^{1/2} := \mathrm{diag}\left( \sqrt{d_1}, \ldots, \sqrt{d_n} \right)$. In fact, using (61) in the third equality and (62) in the fourth,

$$ AP = A[v_1, \ldots, v_n] = [Av_1, \ldots, Av_n] = [Av_1, \ldots, Av_r, 0, \ldots, 0] = \left[ \sqrt{d_1} w_1, \ldots, \sqrt{d_r} w_r, 0, \ldots, 0 \right] = [w_1, \ldots, w_n] D^{1/2} = QD^{1/2} . $$

Now, since $AP = QD^{1/2}$, it follows that $A = QD^{1/2}P^* = UPD^{1/2}P^* = U\sqrt{A^*A}$, as we wanted to show.

To show that $U$ is uniquely determined if $A$ is invertible, assume that there exists $U'$ such that $A = U\sqrt{A^*A} = U'\sqrt{A^*A}$. We noticed above that $\sqrt{A^*A}$ is invertible if and only if $A$ is invertible. Hence, if $A$ is invertible, the equality $U\sqrt{A^*A} = U'\sqrt{A^*A}$ implies $U = U'$. If $A$ is not invertible the arbitrariness of $U$ lies in the choice of the orthonormal set $\{w_k, \; k = r + 1, \ldots, n\}$. ■
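For an invertible matrix the construction in the proof simplifies, since $r = n$ and no completion of the set $\{w_k\}$ is needed: one has $Q = APD^{-1/2}$ and hence $U = QP^* = APD^{-1/2}P^*$. The sketch below, assuming NumPy, covers only this invertible case; the function name `polar_invertible` and the example matrix are ours.

```python
import numpy as np

def polar_invertible(A):
    """Polar decomposition A = U |A| of an invertible square matrix A."""
    A = np.asarray(A, dtype=complex)
    d, P = np.linalg.eigh(A.conj().T @ A)            # A^*A = P diag(d) P^*, with d_k > 0
    sqrt_d = np.sqrt(d)
    abs_A = P @ np.diag(sqrt_d) @ P.conj().T         # |A| = sqrt(A^*A), as in (59)
    U = A @ P @ np.diag(1.0 / sqrt_d) @ P.conj().T   # U = Q P^*, with Q = A P D^{-1/2}
    return U, abs_A

A = np.array([[1.0, 2.0],
              [3.0, 4.0j]])                          # det(A) = 4i - 6, so A is invertible
U, abs_A = polar_invertible(A)
```

One can check that `U` is unitary, that `abs_A` is self-adjoint with `abs_A @ abs_A` equal to $A^*A$, and that `U @ abs_A` reproduces `A`. For a singular matrix some $d_k$ vanish and the division by `sqrt_d` fails; in that case the vectors $w_k$, $k > r$, have to be supplied separately, as in the proof.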

The following corollary is elementary:

Theorem A.15 Let $A \in \mathrm{Mat}(\mathbb{C}, n)$. Then, there exists a unitary matrix $V \in \mathrm{Mat}(\mathbb{C}, n)$ such that

$$ A = \sqrt{AA^*} \, V . \qquad (63) $$

If $A$ is non-singular, then $V$ is unique. □

Proof. For the matrix $A^*$, relation (60) says that $A^* = U_0 \sqrt{(A^*)^*A^*} = U_0 \sqrt{AA^*}$ for some unitary $U_0$. Since $\sqrt{AA^*}$ is self-adjoint, one has $A = \sqrt{AA^*} \, U_0^*$. Identifying $V \equiv U_0^*$, we get what we wanted. ■

The polar decomposition theorem can be generalized to bounded or closed unbounded operators acting on Hilbert spaces and even to C*-algebras. See, e.g., [6] and [7].

Singular values decomposition 

The Polar Decomposition Theorem, Theorem I A. 141 has a corollary of particular interest. 

Theorem A.16 (Singular Values Decomposition Theorem) Let $A \in \mathrm{Mat}(\mathbb{C}, n)$. Then, there exist unitary
matrices $V$ and $W \in \mathrm{Mat}(\mathbb{C}, n)$ such that

$$ A = VSW^* , \tag{64} $$

where $S \in \mathrm{Mat}(\mathbb{C}, n)$ is a diagonal matrix whose diagonal elements are the singular values of $A$, i.e., the eigenvalues
of $\sqrt{A^*A}$. □

Proof. The claim follows immediately from (60) and from (59) by taking $V = UP$, $W = P$ and $S = D^{1/2}$. ■
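
As a numerical illustration (not part of the original argument), the choice $V = UP$, $W = P$, $S = D^{1/2}$ can be checked with a short NumPy sketch. The function name `svd_via_polar` and the test matrix are our own, and the sketch assumes that $A$ is invertible, so that $U = A\,(\sqrt{A^*A})^{-1}$ is available without completing an orthonormal set.

```python
import numpy as np

def svd_via_polar(A):
    """Sketch: SVD of an invertible square matrix via A = U sqrt(A*A)."""
    # Spectral decomposition A*A = P D P*, with D real and non-negative.
    d, P = np.linalg.eigh(A.conj().T @ A)
    d = np.clip(d, 0.0, None)             # guard against tiny negative round-off
    S = np.diag(np.sqrt(d))               # S = D^{1/2}
    sqrt_AstarA = P @ S @ P.conj().T      # sqrt(A*A) = P D^{1/2} P*
    # Assumption: A is invertible, hence so is sqrt(A*A), and U is unique.
    U = A @ np.linalg.inv(sqrt_AstarA)
    V, W = U @ P, P                       # the choice made in the proof
    return V, S, W

A = np.array([[2.0, 1.0j], [0.0, 3.0]])   # hypothetical invertible example
V, S, W = svd_via_polar(A)
A_rebuilt = V @ S @ W.conj().T            # should reproduce A up to round-off
```

For a singular $A$ this sketch fails at the matrix inversion; one would then have to complete the set $\{w_k\}$ as in the proof of the polar decomposition.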

Theorem A.16 can be generalized to rectangular matrices. In what follows, $m, n \in \mathbb{N}$ and we will use the definitions of the
matrices $I_{m,\, m+n}$ and $J_{m+n,\, n}$, definition (8) and relation (9), which allow us to map rectangular matrices injectively into certain square matrices.

Theorem A.17 (Singular Values Decomposition Theorem. General Form) Let $A \in \mathrm{Mat}(\mathbb{C}, m, n)$. Then,
there exist unitary matrices $V$ and $W \in \mathrm{Mat}(\mathbb{C}, m+n)$ such that

$$ A = I_{m,\, m+n}\, VSW^*\, J_{m+n,\, n} , \tag{65} $$

where $S \in \mathrm{Mat}(\mathbb{C}, m+n)$ is a diagonal matrix whose diagonal elements are the singular values of $A'$ (defined in (8)),
i.e., are the eigenvalues of $\sqrt{(A')^* A'}$. □

Proof. The matrix $A' \in \mathrm{Mat}(\mathbb{C}, m+n)$ is a square matrix and, by Theorem A.16, it can be written in terms of a
singular values decomposition $A' = VSW^*$ with $V$ and $W \in \mathrm{Mat}(\mathbb{C}, m+n)$, both unitary, and $S \in \mathrm{Mat}(\mathbb{C}, m+n)$
being a diagonal matrix whose diagonal elements are the singular values of $A'$. Therefore, (65) follows from (9). ■
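
The embedding used in this proof can be sketched numerically. In the NumPy sketch below the helper name `embed` and the example matrix are hypothetical, and the rectangular identity blocks are built under the assumption that $I_{k,\, m+n}$ is a $k \times k$ identity padded with zero columns and that $J_{m+n,\, k}$ is its transpose.

```python
import numpy as np

def embed(A):
    """Return A' = J_{m+n,m} A I_{n,m+n} and the blocks needed for relation (9)."""
    m, n = A.shape
    I_m = np.hstack([np.eye(m), np.zeros((m, n))])   # I_{m,m+n}: m x (m+n)
    I_n = np.hstack([np.eye(n), np.zeros((n, m))])   # I_{n,m+n}: n x (m+n)
    J_m, J_n = I_m.T, I_n.T                          # J_{m+n,m} and J_{m+n,n}
    return J_m @ A @ I_n, I_m, J_n

A = np.array([[1.0, 2.0, 0.0], [0.0, 1.0j, 1.0]])    # m = 2, n = 3
A_prime, I_m, J_n = embed(A)

# Square SVD of A': NumPy returns V, the singular values, and W* directly.
V, s, Wh = np.linalg.svd(A_prime)
A_rebuilt = I_m @ V @ np.diag(s) @ Wh @ J_n          # right-hand side of (65)
```

Here `A_prime` carries $A$ in its upper-left $m \times n$ block and zeros elsewhere, so `A_rebuilt` should agree with `A` up to round-off.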
B Singular Values Decomposition and Existence of the Moore-Penrose Pseudoinverse

We will now present a second proof of the existence of the Moore-Penrose pseudoinverse of a general matrix
$A \in \mathrm{Mat}(\mathbb{C}, m, n)$ making use of Theorem A.16. We first consider square matrices and later consider general rectangular
matrices.

The Moore-Penrose pseudoinverse of square matrices 

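Let us first consider square diagonal matrices. If $D \in \mathrm{Mat}(\mathbb{C}, n)$ is a diagonal matrix, its Moore-Penrose pseudoinverse
is given by $D^+ \in \mathrm{Mat}(\mathbb{C}, n)$, where, for $i = 1, \ldots, n$, one has

$$ (D^+)_{ii} = \begin{cases} (D_{ii})^{-1} , & \text{if } D_{ii} \neq 0 , \\ 0 , & \text{if } D_{ii} = 0 . \end{cases} $$

It is elementary to check that $DD^+D = D$, $D^+DD^+ = D^+$ and that $DD^+$ and $D^+D$ are self-adjoint. Actually,
$DD^+ = D^+D$, a diagonal matrix whose diagonal elements are either 0 or 1:

$$ (DD^+)_{ii} = (D^+D)_{ii} = \begin{cases} 1 , & \text{if } D_{ii} \neq 0 , \\ 0 , & \text{if } D_{ii} = 0 . \end{cases} $$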
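
The diagonal case translates directly into code. The following NumPy sketch is only an illustration: the name `diag_pinv` and the threshold `tol`, used to decide when a floating-point entry counts as zero, are our own assumptions.

```python
import numpy as np

def diag_pinv(D, tol=1e-12):
    """Pseudoinverse of a square diagonal matrix: invert only the nonzero entries."""
    d = np.diag(D)                            # diagonal entries D_ii
    d_plus = np.zeros(d.shape, dtype=complex)
    nonzero = np.abs(d) > tol                 # tol: assumed cutoff for "zero"
    d_plus[nonzero] = 1.0 / d[nonzero]
    return np.diag(d_plus)

D = np.diag([2.0, 0.0, 0.5j])
Dp = diag_pinv(D)
projector = D @ Dp                            # diagonal matrix with entries 0 or 1
```

In exact arithmetic the threshold is unnecessary; in floating point it guards against inverting entries that are zero only up to round-off.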



Now, let $A \in \mathrm{Mat}(\mathbb{C}, n)$ and let $A = VSW^*$ be its singular values decomposition (Theorem A.16). We claim that
its Moore-Penrose pseudoinverse $A^+$ is given by

$$ A^+ = WS^+V^* . \tag{66} $$

In fact, $AA^+A = (VSW^*)(WS^+V^*)(VSW^*) = VSS^+SW^* = VSW^* = A$ and

$$ A^+AA^+ = (WS^+V^*)(VSW^*)(WS^+V^*) = WS^+SS^+V^* = WS^+V^* = A^+ . $$

Moreover, $AA^+ = (VSW^*)(WS^+V^*) = V(SS^+)V^*$ is self-adjoint, since $SS^+$ is a diagonal matrix with diagonal
elements 0 or 1. Analogously, $A^+A = (WS^+V^*)(VSW^*) = W(S^+S)W^*$ is self-adjoint.
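
Relation (66) suggests a simple computational recipe, sketched below with NumPy. The function name `pinv_via_svd` and the cutoff `tol` are assumptions of this sketch; `np.linalg.svd` returns $V$, the singular values and $W^*$.

```python
import numpy as np

def pinv_via_svd(A, tol=1e-12):
    """Sketch of A^+ = W S^+ V* for a square matrix A = V S W*."""
    V, s, Wh = np.linalg.svd(A)               # Wh is W*, s holds the diagonal of S
    s_plus = np.zeros_like(s)
    mask = s > tol                            # tol: assumed cutoff for zero singular values
    s_plus[mask] = 1.0 / s[mask]
    return Wh.conj().T @ np.diag(s_plus) @ V.conj().T

A = np.array([[1.0, 2.0], [2.0, 4.0]])        # singular matrix of rank one
Ap = pinv_via_svd(A)

# The four defining properties, checked up to round-off.
checks = [
    np.allclose(A @ Ap @ A, A),
    np.allclose(Ap @ A @ Ap, Ap),
    np.allclose(A @ Ap, (A @ Ap).conj().T),
    np.allclose(Ap @ A, (Ap @ A).conj().T),
]
```

For this rank-one example the result can also be verified by hand, since $A^+ = A^T / 25$.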

The Moore-Penrose pseudoinverse of rectangular matrices 

Consider now $A \in \mathrm{Mat}(\mathbb{C}, m, n)$ and let $A' \in \mathrm{Mat}(\mathbb{C}, m+n)$ be the $(m+n) \times (m+n)$ matrix defined in (8). Since $A'$ is a
square matrix it has, by the comments above, a unique Moore-Penrose pseudoinverse $(A')^+$ satisfying

1. $A' (A')^+ A' = A'$,

2. $(A')^+ A' (A')^+ = (A')^+$,

3. $A' (A')^+$ and $(A')^+ A'$ are self-adjoint.

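In what follows we will show that $A^+ \in \mathrm{Mat}(\mathbb{C}, n, m)$ is given by

$$ A^+ := I_{n,\, m+n}\, (A')^+\, J_{m+n,\, m} , \tag{67} $$

with $A'$ as defined in (8), i.e.,

$$ A^+ = I_{n,\, m+n}\, \big( J_{m+n,\, m}\, A\, I_{n,\, m+n} \big)^+\, J_{m+n,\, m} . $$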

The starting point is the existence of the Moore-Penrose pseudoinverse of the square matrix $A'$. Relation $A' (A')^+ A' = A'$
means, using definition (8), that $J_{m+n,\, m}\, A\, \big[ I_{n,\, m+n} (A')^+ J_{m+n,\, m} \big]\, A\, I_{n,\, m+n} = J_{m+n,\, m}\, A\, I_{n,\, m+n}$ and from (6)–(7)
it follows, by multiplying to the left by $I_{m,\, m+n}$ and to the right by $J_{m+n,\, n}$, that $AA^+A = A$, one of the relations we
wanted to prove.

Relation $(A')^+ A' (A')^+ = (A')^+$ means, using definition (8), that $(A')^+ J_{m+n,\, m}\, A\, I_{n,\, m+n} (A')^+ = (A')^+$. Multiplying
to the left by $I_{n,\, m+n}$ and to the right by $J_{m+n,\, m}$, this establishes that $A^+AA^+ = A^+$.

Since $A' (A')^+$ is self-adjoint, it follows from the definition (8) that $J_{m+n,\, m}\, A\, I_{n,\, m+n} (A')^+$ is self-adjoint, i.e.,

$$ J_{m+n,\, m}\, A\, I_{n,\, m+n} (A')^+ = \big( A\, I_{n,\, m+n} (A')^+ \big)^* I_{m,\, m+n} . $$

Therefore, multiplying to the left by $I_{m,\, m+n}$ and to the right by $J_{m+n,\, m}$, it follows from (6) that

$$ A\, I_{n,\, m+n} (A')^+ J_{m+n,\, m} = I_{m,\, m+n} \big( A\, I_{n,\, m+n} (A')^+ \big)^* = \big( A\, I_{n,\, m+n} (A')^+ J_{m+n,\, m} \big)^* , $$

proving that $AA^+$ is self-adjoint.

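Finally, since $(A')^+ A'$ is self-adjoint, it follows from definition (8) that $(A')^+ J_{m+n,\, m}\, A\, I_{n,\, m+n}$ is self-adjoint, i.e.,

$$ (A')^+ J_{m+n,\, m}\, A\, I_{n,\, m+n} = J_{m+n,\, n} \big( (A')^+ J_{m+n,\, m}\, A \big)^* . $$

Hence, multiplying to the left by $I_{n,\, m+n}$ and to the right by $J_{m+n,\, n}$, it follows from (7) that

$$ I_{n,\, m+n} (A')^+ J_{m+n,\, m}\, A = \big( (A')^+ J_{m+n,\, m}\, A \big)^* J_{m+n,\, n} = \big( I_{n,\, m+n} (A')^+ J_{m+n,\, m}\, A \big)^* , $$

establishing that $A^+A$ is self-adjoint. This proves that $A^+$ given in (67) is the Moore-Penrose pseudoinverse of $A$.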
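
Relation (67) can also be checked numerically. The NumPy sketch below is a hedged illustration: the function name `pinv_rectangular` and the example are hypothetical, the blocks $I_{n,\, m+n}$ and $J_{m+n,\, m}$ are assumed to be identity matrices padded with zeros, and the pseudoinverse of the square matrix $A'$ is delegated to `np.linalg.pinv` rather than built from (66).

```python
import numpy as np

def pinv_rectangular(A):
    """Sketch of A^+ = I_{n,m+n} (A')^+ J_{m+n,m} for an m x n matrix A."""
    m, n = A.shape
    I_n = np.hstack([np.eye(n), np.zeros((n, m))])   # I_{n,m+n}: n x (m+n)
    J_m = np.vstack([np.eye(m), np.zeros((n, m))])   # J_{m+n,m}: (m+n) x m
    A_prime = J_m @ A @ I_n                          # definition (8): square embedding
    A_prime_plus = np.linalg.pinv(A_prime)           # pseudoinverse of the square A'
    return I_n @ A_prime_plus @ J_m                  # relation (67)

A = np.array([[1.0, 0.0, 1.0], [0.0, 2.0, 0.0]])     # m = 2, n = 3, full row rank
Ap = pinv_rectangular(A)
```

Since this example has full row rank, the result can be compared with the closed form $A^T (AA^T)^{-1}$, which equals the $3 \times 2$ matrix with rows $(1/2, 0)$, $(0, 1/2)$, $(1/2, 0)$.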